IDENTIFICATION OF NEW SMALL RNAs AND ORFs OF E. COLI AS
MEDIATORS OF CELL AND INTERCELL REGULATION

Field of the Invention 

The invention relates to new small RNAs and ORFs of E. coli as mediators of cell
and intercell regulation.

Background of the Invention
In the last few years, the importance of regulatory small RNAs (sRNAs) as
mediators of a number of cellular processes in bacteria has begun to be recognized.
Although instances of naturally occurring antisense RNAs have been known for many
years, the participation of sRNAs in protein tagging for degradation, modulation of RNA
polymerase activity, and stimulation of translation are relatively recent discoveries (see
Wassarman, K.M. et al. 1999 Trends Microbiol 7:37-45 for review; Wassarman, K.M. and
Storz, G. 2000 Cell 101:613-623). These findings have raised questions about how
extensively sRNAs are used, what other cellular activities might be regulated by sRNAs,
and what other mechanisms of action exist for sRNAs. In addition, prokaryotic sRNAs
appear to target different cellular functions than their eukaryotic counterparts that primarily
act during RNA biogenesis. It is unclear whether this difference between prokaryotic and
eukaryotic sRNAs is accurate or stems from the incompleteness of current knowledge.
Implicit in these questions is the question of how many sRNAs exist in a given organism
and whether the current known sRNAs are truly representative of sRNA function in
general.

To date, most known bacterial sRNAs have been identified fortuitously by the direct
detection of highly abundant sRNAs (4.5S RNA, tmRNA, 6S RNA, RNase P RNA, and
Spot42 RNA), by the observation of an sRNA during studies on proteins (OxyS RNA, Crp
Tic RNA, CsrB RNA, and GcvB RNA) or by the discovery of activities associated with
overexpression of genomic fragments (MicF RNA, DicF RNA, DsrA RNA, and RprA
RNA) (Okamoto, K. and Freundlich, M. 1986 PNAS USA 83:5000-5004; Bhasin, R.S. 1989
Studies on the mechanism of the autoregulation of the crp operon of E. coli K12 In: Dept.
of Biochemistry and Cell Biology, State University of New York at Stonybrook;
Urbanowski, M.L. et al. 2000 Mol Microbiol 37:856-868; Wassarman, K.M. and Storz, G.
2000 Cell 101:613-623; Majdalani, N. et al. 2001 Mol Microbiol 39:1382-1394; for review
see Wassarman, K.M. et al. 1999 Trends Microbiol 7:37-45). None of the E. coli sRNAs
were found as a result of mutational screens. This observation may reflect the small target
size of genes encoding sRNAs compared to protein genes, or may be a consequence of the
regulatory rather than essential nature of many sRNA functions. The complete genome
sequence of an organism provides a rapid inventory of most encoded proteins, tRNAs, and
rRNAs, but it has not led to the immediate recognition of other genes that are not
translated. In particular, new bacterial sRNA genes have been overlooked, as there are no
identifiable classes of sRNAs that can be found based solely on sequence determinants.

Segue to the Summary of the Invention

We and others have previously suggested several approaches to look for new
sRNAs including computer searching of complete genomes based on parameters common
to sRNAs, probing of genomic microarrays, and isolating sRNAs based on an association
with general RNA binding proteins (Wassarman, K.M. et al. 1999 Trends Microbiol 7:37-45;
Eddy, S.R. 1999 Curr Opin Genet Dev 9:695-699). Using a combination of these
approaches, we have identified 17 novel sRNAs; in addition, we have found six small
transcripts that contain short conserved open reading frames (ORFs).

Summary of the Invention
A burgeoning list of small RNAs with a variety of regulatory functions has been
identified in both prokaryotic and eukaryotic cells. However, it remains difficult to identify
small RNAs by sequence inspection. We utilized the high conservation of small RNAs
among closely related bacterial species, as well as analysis of transcripts detected by
high-density oligonucleotide probe arrays, to predict the presence of novel small RNA genes in
the intergenic regions of the Escherichia coli genome. The existence of 23 distinct new
RNA species was confirmed by Northern analysis. Of these, six are predicted to encode
short ORFs, whereas 17 are novel functional small RNAs. Based on the interaction of
these small RNAs with the RNA binding protein Hfq, the modulation of rpoS expression,
and other information, we contemplate these new small RNAs and ORFs of E. coli as
mediators of cell and intercell regulation. As such, we anticipate their use in the
development of diagnostics and in the development of antibiotics.
Brief Description of the Drawings

Figure 1 shows BLAST alignments of representative Ig regions. The indicated Ig regions
were used in a BLAST search of the NCBI Unfinished Microbial Genomes database. Each
panel shows the summary figure provided by the BLAST program for matches to
Salmonella enteritidis, Salmonella paratyphi A, Salmonella typhi, Salmonella typhimurium
LT2 and Klebsiella pneumoniae. Three contain known sRNA genes (rprA, csrB, and oxyS),
and four contain sRNA candidates (#14, #17, #52, and #36; see Table 1). For each panel,
the center numbered line represents the length of the full Ig region; the orientation of
flanking genes is given by > (clockwise) or < (counterclockwise). The top hatched line in
each panel is the match to E. coli (full identity throughout the Ig). The other hatched or
double-diagonal lines resulted from the closest matches, and the other lines indicate
additional less homologous matches. Location of the conserved region with respect to the
borders of the Ig region also was a criterion used for the selection of our candidates;
conservation 3' to an ORF or far from the 5' start of an ORF was considered more likely to
encode an sRNA. Note that the conservation within the Ig region encoding oxyS might be
interpreted as a leader sequence based on location relative to the start of the flanking gene
(oxyR). However, the conservation extends for 185 nt, and therefore candidate regions in
our search in which the conservation was near the start of an ORF but was longer than 150
nt were considered further.

Figure 2 is the expression profile across high-density oligonucleotide arrays for
representative Ig regions. Probe intensities are shown for the indicated Ig regions (solid
bars) and the flanking ORFs (hatched bars), calculated from the perfect match minus the
mismatch intensities. All negative differences were set to zero. The data shown are for one
experiment using cDNA probes, but similar results were seen in the duplicate experiment
and with directly labeled RNA probes. The Ig regions and each flanking gene generally
contain 15 interrogating probes. Upward bars correspond to genes transcribed on the
Watson (W, clockwise) strand and downward bars correspond to genes transcribed on the
Crick (C, counterclockwise) strand. The C strand signal for the CsrB Ig region corresponds
well with the known location of the csrB gene. Similarly for the RprA Ig region, the W
strand signal corresponds with the location of the rprA gene, but only one probe is positive.
The W strand signal for #14 and the C strand signal for #17 overlap well with the
conserved regions shown in the BLAST analysis in Figure 1. #36 was chosen for further
analysis because of the strong C strand signal; both flanking ORFs are on the W strand.
For #52, low levels of expression were seen on both strands; the very low level for probes
in the middle of the Ig on the C strand overlapped best with the conserved region found by
the BLAST searches (Figure 1).
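
The probe intensity calculation described for Figure 2 (perfect match minus mismatch, with negative differences set to zero) can be sketched as follows. This is a minimal illustration only; the function name and the sample intensities are hypothetical and are not taken from the array data.

```python
def probe_differences(perfect_match, mismatch):
    """Return perfect-match minus mismatch intensities, clamping negatives to zero."""
    if len(perfect_match) != len(mismatch):
        raise ValueError("each perfect-match probe needs a paired mismatch probe")
    return [max(pm - mm, 0) for pm, mm in zip(perfect_match, mismatch)]


# Hypothetical intensities for four probe pairs (illustrative values only).
pm = [120, 80, 300, 45]
mm = [100, 95, 50, 45]
print(probe_differences(pm, mm))
```

A probe pair whose mismatch signal equals or exceeds its perfect-match signal contributes zero, so it appears as an absent bar rather than a negative one.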

Figure 3 shows detection of novel sRNAs by Northern hybridization. Northern
hybridization using strand specific probes for each candidate was done on RNA extracted
from MG1655 cells grown under three different growth conditions: (E) exponential growth
in LB medium; (M) exponential growth in M63-glucose medium; and (S) stationary phase
in LB medium. Five µg of total RNA was loaded in each lane. Exposure times were
optimized for each panel for visualization here; therefore the signal intensity shown does
not indicate relative abundance between sRNAs. Oligonucleotide probes were used for
#12, #22, #55-I, #55-II, and #61; RNA probes were used for all other panels. DNA
molecular weight markers (5'-end-labeled MspI-digested pBR322 DNA) were run with
each set of samples for direct estimation of RNA transcript length. One lane of DNA
molecular weight markers is shown for comparison, but these are approximate sizes as
there was slight variation in running of gels.

Figure 4 shows results of coimmunoprecipitation of sRNAs with the Hfq protein. (A)
Immunoprecipitations using extract from MG1655 cells grown in LB medium in
exponential growth (OD600 = 0.4) were done using no antibody (lane 1); 5 µl of preimmune
serum (lane 2); or 0.5, 1, 5, or 10 µl of Hfq antisera (lanes 3-6). Selected RNAs were
fractionated on a 10% polyacrylamide urea gel after 3'-end labeling. Asterisks mark RNA
bands present in the anti-Hfq precipitated samples but not in the preimmune control samples
and therefore represent Hfq-interacting RNAs. (B) Immunoprecipitations were done using
extract from MG1655 cells grown under three different growth conditions: (E) exponential
growth in LB medium; (M) exponential growth in M63-glucose medium; and (S) stationary
phase in LB medium. Immunoprecipitations were carried out with 5 µl of preimmune sera
(lane 1) or 5 µl Hfq antisera (lane 2) and compared to total RNA from 1/10 extract
equivalent used in the immunoprecipitations (lane 3). RNAs were fractionated on 10%
polyacrylamide urea gels and analyzed by Northern hybridization using RNA probes to
previously known sRNAs or our novel RNAs as indicated.
Brief Description of the Sequences

Candidate Number      SEQ ID NO
12                    1
14                    2
22                    3
24                    4
25                    5
26                    6
27                    7
31                    8
38                    9
40                    10
41-I                  11
41-II                 12
52-I                  13
52-II                 14
55-I                  15
55-II                 16
61                    17
8                     18
43                    19
9 (nucleotide)        20
9 (amino acid)        21
17 (nucleotide)       22
17 (amino acid)       23
28 (nucleotide)       24
28 (amino acid)       25
36 (nucleotide)       26
36 (amino acid)       27
49 (nucleotide)       28
49 (amino acid)       29
50 (nucleotide)       30
50 (amino acid)       31



Detailed Description of the Preferred Embodiment
By "RNA" or "gene product" or "transcription product" is meant the RNA encoded
by the E. coli gene or RNA substantially homologous or complementary thereto or a
derivative or fragment thereof having RNA activity. Encompassed by the definition of
"RNA" are variants of RNA in which there have been trivial mutations such as
substitutions, deletions, insertions or other modifications of the native RNA. The term
"substantial homology" or "substantial identity", when referring to polypeptides or
polynucleotides, indicates that the sequence of a polypeptide or polynucleotide in question,
when properly aligned, exhibits at least about 30% identity with the sequence of an entire
naturally occurring polypeptide or polynucleotide or a portion thereof. Polynucleotides of
the present invention which are homologous or substantially homologous to, for example,
the polynucleotides of the invention usually have at least about 70% identity to that shown in
the Sequence Listing, preferably at least about 90% identity and most preferably at least
about 95% identity, or a complement thereof. Any technique known in the art can be used
to sequence polynucleotides, including, for example, dideoxynucleotide sequencing
(Sanger et al. 1977 PNAS USA 74:5463-5467), or using the Sequenase™ kit (United States
Biochemical Corp.). Homologs of polynucleotides and polypeptides, whether synthetically
or recombinantly produced or found in nature, are also encompassed by the scope of the
invention, and are herein defined as polynucleotides and polypeptides which are
homologous to, respectively, polynucleotides and polypeptides of the invention, or
fragments, variants, or complements thereof. Homologous polynucleotides and
polypeptides are generally encoded by homologous genes as described above, and retain
significant amino acid residue or nucleotide identity to the genes of the invention. Such
polypeptides can be expressed by other organisms such as bacteria, yeast and higher order
organisms such as mammals. Various methods of determining amino acid residue or
nucleotide identity are known in the art. Homologous polynucleotides or polypeptides can
be obtained by in vitro synthesis by expressing genes derived from other bacteria or by
mutagenizing genes of the invention. Also included in the definition of "substantially
homologous polynucleotides" would be those polynucleotides which, when annealed under
conditions known in the art, would remain annealed under moderate wash conditions also
known in the art (such as washing in 6x SSPE twice at room temperature and then twice at
(Wahl et al. 1987 Methods in Enzymology 152 Academic Press Inc., San Diego).
Polynucleotide and polypeptide homology is typically measured using sequence
analysis software. See, e.g., Sequence Analysis Software Package of the Genetics
Computer Group, University of Wisconsin Biotechnology Center, 1710 University Avenue,
Madison, Wis. 53705.
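
By way of illustration only, percent identity between two already-aligned sequences, and the identity levels recited above (about 30%, 70%, 90% and 95%), can be sketched as follows. The function names, the gap character and the choice of denominator (all alignment columns, excluding columns that are gaps in both sequences) are assumptions made for this sketch; sequence analysis packages may define identity differently, and "about" in the text is treated here as an exact cutoff purely for simplicity.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent of alignment columns in which both sequences carry the same residue."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    # Ignore columns that are gaps in both sequences.
    columns = [(a, b) for a, b in zip(aligned_a.upper(), aligned_b.upper())
               if not (a == gap and b == gap)]
    if not columns:
        return 0.0
    matches = sum(1 for a, b in columns if a == b and a != gap)
    return 100.0 * matches / len(columns)


def identity_level(pct):
    """Map a percent identity onto the levels recited in the description."""
    if pct >= 95:
        return "at least about 95%"
    if pct >= 90:
        return "at least about 90%"
    if pct >= 70:
        return "at least about 70%"
    if pct >= 30:
        return "at least about 30%"
    return "below about 30%"


# Hypothetical aligned fragments (illustrative only).
pct = percent_identity("ACGT-A", "ACCTGA")
print(round(pct, 1), identity_level(pct))
```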

By "polynucleotide" or "nucleic acid" is meant a single- or double-stranded DNA,
genomic DNA, cDNA, RNA, DNA-RNA hybrid, or a polymer comprising purine and
pyrimidine bases, or other natural, chemically or biochemically modified or containing
non-natural or derivatized nucleotide bases. The backbone of the polynucleotide can
comprise sugars and phosphate groups (as typically found in RNA or DNA), or modified or
substituted sugar or phosphate groups. Alternatively, the backbone of the polynucleotide
can comprise a polymer of synthetic subunits such as phosphoramidates and is thus an
oligodeoxynucleoside phosphoramidate (P-NH2) or a mixed
phosphoramidate-phosphodiester oligomer (Peyrottes et al. 1996 Nucleic Acids Res 24:1841-8;
Chaturvedi et al. 1996 Nucleic Acids Res 24:2318-23; and Schultz et al. 1996 Nucleic Acids
Res 24:2966-73). In another embodiment, a phosphorothioate linkage can be used in place of a
phosphodiester linkage (Braun et al. 1988 J Immunol 141:2084-9; and Latimer et al. 1995
Mol Immunol 32:1057-1064). In addition, a double-stranded polynucleotide can be
obtained from the single-stranded polynucleotide product of chemical synthesis either by
synthesizing the complementary strand and annealing the strands under appropriate
conditions, or by synthesizing the complementary strand de novo using a DNA polymerase
with an appropriate primer.

A nucleic acid is said to "encode" an RNA or a polypeptide if, in its native state or
when manipulated by methods known to those skilled in the art, it can be transcribed and/or
translated to produce the RNA, the polypeptide or a fragment thereof. The anti-sense
strand of such a nucleic acid is also said to encode the sequence. The polynucleotides of
the present invention comprise those which are naturally-occurring, synthetic or
recombinant.
A "recombinant" nucleic acid is one which is chemically synthesized or the product
of the artificial manipulation of isolated segments of nucleic acids, e.g., by genetic
engineering techniques. Isolated segments within a recombinant nucleic acid can be
naturally occurring sequences.

By "polynucleotide" or "gene" or "RNA" and the like is meant a polynucleotide
encoding or comprising the RNA of the invention, or a homolog, fragment, derivative or
complement thereof and having RNA activity as described herein. As is known in the art, a
DNA can be transcribed by an RNA polymerase to produce RNA, and an RNA can be
reverse transcribed by reverse transcriptase to produce a DNA. Thus a DNA can encode an
RNA and vice versa.
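
The sense in which a DNA can encode an RNA and vice versa can be illustrated at the level of base pairing. The following is a minimal sketch under simplifying assumptions: sequences are plain strings written 5' to 3', only the four standard bases are handled, and the function names and example sequence are hypothetical.

```python
# Base-pairing maps: template DNA base -> RNA base, and RNA base -> cDNA base.
DNA_TO_RNA = str.maketrans("ACGT", "UGCA")
RNA_TO_CDNA = str.maketrans("ACGU", "TGCA")


def transcribe(template_dna):
    """RNA (5'->3') complementary to a template DNA strand given 5'->3'."""
    return template_dna.upper().translate(DNA_TO_RNA)[::-1]


def reverse_transcribe(rna):
    """First-strand cDNA (5'->3') complementary to an RNA given 5'->3'."""
    return rna.upper().translate(RNA_TO_CDNA)[::-1]


template = "TTAGCAT"                 # hypothetical template strand
rna = transcribe(template)           # RNA copy of the opposite strand
cdna = reverse_transcribe(rna)       # recovers the template sequence
print(rna, cdna)
```

Reverse transcribing the transcript returns the original template strand, which is the round trip the paragraph above describes.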

The invention also encompasses vectors such as single- and double-stranded
plasmids or viral vectors comprising RNA, DNA or a mixture or variant thereof, further
comprising a polynucleotide of the invention. A wide variety of suitable expression
systems are known in the art and are selected based on the host cells used, inducibility of
expression desired and ease of use. The non-transcribed portions of a gene and the
non-coding portions of a gene can be modified as known in the art. For example, the native
promoters can be deleted, substituted or supplemented with other promoters known in the
art; transcriptional enhancers, inducible promoters or other transcriptional control elements
can be added, as can be replication origins and replication initiator proteins, autonomously
replicating sequence (ARS), marker genes (e.g., antibiotic resistance markers), sequences
for chromosomal integration (e.g., viral integration sites or sequences homologous to
chromosomal sequences), restriction sites, multiple cloning sites, ribosome-binding sites,
RNA splice sites, polyadenylation sites, transcriptional terminator sequences, mRNA
stabilizing sequences, 5' stem-loop to protect against degradation, and other elements
commonly found on plasmids and other vectors known in the art. Secretion signals from
secreted polypeptides can also be included to allow the polypeptide to cross and/or lodge in
cell membranes or be secreted from the cell. Such vectors can be prepared by means of
standard recombinant techniques discussed, for example, in Sambrook et al. 1989
Molecular Cloning: A Laboratory Manual, 2nd edition, Cold Spring Harbor Press, Cold
Spring Harbor Laboratory, N.Y.; and Ausubel et al. (eds.), 1987 Current Protocols in
Molecular Biology, Greene Publishing Associates, Brooklyn, N.Y. Many useful vectors
are known in the art and can be obtained from vendors including, but not limited to,
Stratagene, New England Biolabs, and Promega Biotech.

An appropriate promoter and other necessary vector sequences are selected so as to
be functional in the chosen host. While prokaryotic host cells are preferred, mammalian or
other eukaryotic host cells, including, but not limited to, yeast, filamentous fungi, plant,
insect, amphibian or avian species, can also be useful for production of the polypeptides of
the present invention. See, Kruse et al. (eds.) 1973 Tissue Culture Academic Press.
Examples of workable combinations of cell lines and expression vectors are described in
Sambrook et al. 1989 or Ausubel et al. 1987; see also, e.g., Metzger et al. 1988 Nature
334:31-36. Examples of commonly used mammalian host cell lines are VERO and HeLa
cells, Chinese hamster ovary (CHO) cells, and WI38, BHK, and COS cell lines, or others
as appropriate, e.g., to provide higher expression, desirable glycosylation patterns, etc.

By "bacterial host cell" or "bacteria" or "bacterium" is meant various
micro-organism(s) containing at least one chromosome but lacking a discrete nuclear membrane.
Representatives include E. coli, Bacillus, Salmonella, Pseudomonas, Staphylococcus and
other eubacteria, archaebacteria, chlamydia and rickettsia and related organisms, and the
like, and may be spherical, rod-like, straight, curved, spiral, filamentous or other shapes.

Vectors suitable for use with various cells can comprise promoters which can, when
appropriate, include those naturally associated with genes of the invention. Promoters can
be operably linked to a polynucleotide of the invention.

A nucleic acid sequence is "operably linked" when it is placed into a functional
relationship with another nucleic acid sequence. For instance, a promoter is operably
linked to a coding sequence if the promoter affects the transcription or expression of the
gene. Generally, operably linked means that the DNA sequences being linked are
contiguous and, where necessary to join two protein coding regions, contiguous and in
reading frame.

Promoters can be inducible or repressible by factors which respond biochemically
to changes in temperature, osmolality, carbon source, sugars, etc., as is known in the art.
Promoters including, but not limited to, the trp, lac and phage promoters, tRNA promoters
and glycolytic enzyme promoters can be used in prokaryotic hosts. Useful yeast promoters
include, but are not limited to, the promoter regions for metallothionein,
3-phosphoglycerate kinase or other glycolytic enzymes such as enolase or
glyceraldehyde-3-phosphate dehydrogenase, and enzymes responsible for maltose and galactose utilization.
Appropriate foreign mammalian promoters include, but are not limited to, the early and late
promoters from SV40 (Fiers et al. 1978 Nature 273:113-120) and promoters derived from
murine Moloney leukemia virus, mouse mammary tumor virus, avian sarcoma viruses,
adenovirus II, bovine papilloma virus or polyoma. In addition, the construct can be joined
to an amplifiable gene (e.g., DHFR) so that multiple copies of the construct can be made.
For appropriate enhancer and other expression control sequences suitable for vectors, see
also Enhancers and Eukaryotic Gene Expression, Cold Spring Harbor Press: N.Y. 1983.

While expression vectors are preferably autonomously replicating, they can also be
inserted into the genome of the host cell by methods known in the art. Expression and
cloning vectors preferably contain a selectable marker which is a gene encoding a protein
necessary under at least one control for the survival or growth of a host cell transformed
with the vector. The presence of this gene ensures the growth of only those host cells
which express the inserts. Typical selection genes are known in the art and include, but are
not limited to, those which encode proteins that (a) confer resistance to antibiotics or other
toxic substances, e.g., ampicillin, neomycin, methotrexate, etc.; (b) complement
auxotrophic deficiencies; or (c) supply critical nutrients not available from complex media,
e.g., the gene encoding D-alanine racemase for Bacilli. The choice of the proper selectable
marker depends on the host cell, as appropriate markers for different hosts are well known.

As one of skill in the art will understand, the choice in construction and
arrangement of markers, promoters, origins of replication, etc. in various vectors of the
present invention will be dictated by the desired level and timing of expression of RNA of
the invention, with the ultimate goal of regulating the production of metabolic compounds
in the host cell.

By "protein" or "polypeptide" is meant a polypeptide encoded by the E. coli gene of
the invention or a polypeptide substantially homologous thereto and having protein activity.
Encompassed by the proteins of the invention are variants thereof in which there have been
trivial substitutions, deletions, insertions or other modifications of the native polypeptide
which substantially retain protein characteristics, particularly silent or conservative
substitutions. Silent nucleotide substitutions are changes of one or more nucleotides which
do not change any amino acid of the protein. Conservative substitutions include substitutions
within the following groups: glycine, alanine; valine, isoleucine, leucine; aspartic acid,
glutamic acid; asparagine, glutamine; serine, threonine; lysine, arginine; and phenylalanine,
tyrosine. Such conservative substitutions are not expected to interfere with biochemical
activity, particularly when they occur in structural regions (e.g., alpha helices or beta
pleated sheets) of the polypeptide, which can be predicted by standard computer analysis of
the amino acid sequence of the protein. Also encompassed by the claimed polypeptides of
the invention are polypeptides encoded by polynucleotides which are substantially
homologous to a polynucleotide of the invention.
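
The conservative substitution groups listed above can be expressed as a simple lookup. The sketch below uses one-letter amino acid codes; the function name is hypothetical, and treating an unchanged residue as "not a substitution" is a choice made for this illustration.

```python
# Groups recited in the text, in one-letter amino acid codes.
CONSERVATIVE_GROUPS = [
    {"G", "A"},        # glycine, alanine
    {"V", "I", "L"},   # valine, isoleucine, leucine
    {"D", "E"},        # aspartic acid, glutamic acid
    {"N", "Q"},        # asparagine, glutamine
    {"S", "T"},        # serine, threonine
    {"K", "R"},        # lysine, arginine
    {"F", "Y"},        # phenylalanine, tyrosine
]


def is_conservative(original, replacement):
    """True if the two residues differ but fall within the same listed group."""
    a, b = original.upper(), replacement.upper()
    if a == b:
        return False  # identical residues: no substitution took place
    return any(a in group and b in group for group in CONSERVATIVE_GROUPS)


print(is_conservative("V", "L"), is_conservative("D", "K"))
```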
Nucleic acids encoding the polypeptides of the present invention include not only
native or wild-type sequences but also any sequence capable of encoding the polypeptide,
which can be synthesized by making use of the redundancy in the genetic code. Various
codon substitutions can be introduced, e.g., silent or conservative changes as discussed
above. Due to degeneracy in the genetic code there is some degree of flexibility in the third
base of each codon and some amino acid residues are encoded by several different codons.
Each possible codon could be used in the gene to encode the protein. While this may
appear to present innumerable choices, in practice, each host has a particular preferred
codon usage, so that genes can be tailored for optimal translation in the host in which they
are expressed. Thus, synthetic genes that encode the proteins of the invention are included
in this invention.
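
Codon degeneracy, silent substitutions and host-tailored recoding can be sketched with the standard genetic code. This is an illustration under stated assumptions: the table is the standard code written over DNA bases, the function names are hypothetical, and the `preferred` codon map in the example is an illustrative stand-in supplied by the caller, not a measured codon-usage table for any host.

```python
from itertools import product

# Standard genetic code, codons enumerated with bases in the order T, C, A, G.
BASES = "TCAG"
AMINO_ACIDS = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
               "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
CODON_TABLE = {"".join(codon): aa
               for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}


def translate(dna):
    """Translate a DNA coding sequence codon by codon ('*' marks a stop)."""
    dna = dna.upper()
    usable = len(dna) - len(dna) % 3  # ignore a trailing partial codon
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


def synonymous_codons(amino_acid):
    """All codons that encode the given amino acid."""
    return sorted(c for c, aa in CODON_TABLE.items() if aa == amino_acid)


def is_silent(codon_a, codon_b):
    """True if two different codons encode the same amino acid."""
    a, b = codon_a.upper(), codon_b.upper()
    return a != b and CODON_TABLE[a] == CODON_TABLE[b]


def recode(protein, preferred):
    """Back-translate a protein using one caller-chosen codon per amino acid."""
    return "".join(preferred[aa] for aa in protein)


# Hypothetical preferred-codon choices for three residues (illustrative only).
preferred = {"M": "ATG", "A": "GCG", "L": "CTG"}
gene = recode("MAL", preferred)
print(gene, translate(gene), synonymous_codons("L"))
```

Any codon returned by `synonymous_codons` could be used at a given position; `recode` simply fixes one choice per residue, which is the sense in which a synthetic gene can be tailored to a host.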

Techniques for nucleic acid manipulation are described generally, for example, in
Sambrook et al. (1989) and Ausubel et al. (1987). Reagents useful in applying such
techniques, such as restriction enzymes and the like, are widely known in the art and
commercially available from vendors including, but not limited to, New England BioLabs,
Boehringer Mannheim, Amersham, Promega Biotech, U.S. Biochemicals, New England
Nuclear, and a number of other sources.

Nucleic acid probes and primers based on sequences of the invention can be
prepared by standard techniques. Such a probe or primer comprises an isolated nucleic
acid. In the case of probes, the nucleic acid further comprises a label (e.g., a radionuclide
such as 32P-ATP or 35S) or a reporter molecule (e.g., a ligand such as biotin or an enzyme
such as horseradish peroxidase). The [32P]-ATP, [35S]-dATP and [35S]-methionine can be
purchased, for example, from DuPont NEN (Wilmington, Del.). Probes can be used to
identify the presence of a hybridizing nucleic acid sequence, e.g., an RNA in a sample or a
cDNA or genomic clone in a library. Primers can be used, for example, for amplification of
nucleic acid sequences, e.g., by the polymerase chain reaction (PCR). See, e.g., Innis et al.
(eds.) 1990 PCR Protocols: A Guide to Methods and Applications, Academic Press: San
Diego. The preparation and use of probes and primers is described, e.g., in Sambrook et al.
(1989) or Ausubel et al. (1987). The genes of homologs of RNA of the invention in other
species can be obtained by generating cDNA from RNA from such species using any
technique known in the art, such as using Riboclone cDNA Synthesis Systems AMV RT
(Promega, Madison, Wis.), then probing such cDNA with radiolabeled primers containing
various portions (e.g., 30 or 40 bases long) of the sequences disclosed herein. To obtain
homologs of the proteins of the invention, degenerate primers can encode the amino acid
sequence of the disclosed E. coli protein but differ in codon usage from the sequences
disclosed.

Antisense and ribozyme nucleic acids capable of specifically binding to sequences
of the invention are also useful for interfering with gene expression.

The nucleic acids of the present invention (whether sense or anti-sense, and whether
encoding the genes of the invention, or a homolog, variant, fragment or complement
thereof) can be produced in large amounts by replication of a suitable recombinant vector
comprising DNA sequences in a compatible host cell. Alternatively, these nucleic acids
can be chemically synthesized, e.g., by any method known in the art, including, but not
limited to, the phosphoramidite method described by Beaucage et al. 1981 Tetra Letts
22:1859-1862, and the triester method according to Matteucci et al. 1981 J Am Chem Soc
103:3191, preferably using commercial automated synthesizers. The purification of nucleic
acids produced by the methods of the present invention can be achieved by any method
known in the art including, but not limited to, those described, e.g., in Sambrook et al.
(1989), or Ausubel et al. (1987). Numerous commercial kits are available for DNA
purification including Qiagen plasmid mini DNA cartridges (Chatsworth, Calif.).

The nucleic acids of the present invention can be introduced into host cells by any
method known in the art, which vary depending on the type of cellular host, including, but
not limited to, electroporation; transfection employing calcium chloride, rubidium chloride,
calcium phosphate, DEAE-dextran, or other substances; microprojectile bombardment; P1
transduction; use of suicide vectors; lipofection; infection (where the vector is an infectious
agent, such as a retroviral genome); and other methods. See generally, Sambrook et al.
(1989), and Ausubel et al. (1987). The cells into which these nucleic acids have been
introduced also include the progeny of such cells.

A polypeptide "fragment", "portion", or "segment" is a stretch of amino acid
residues of at least about 7 to 19 amino acids (or the minimum size retaining an antigenic
determinant). A fragment of the present invention can comprise a portion of at least 20
amino acids of the protein sequence, at least 30 amino acids of the protein sequence, at least
40 amino acids of the protein sequence, at least 50 amino acids of the protein sequence, or
all or substantially all of the protein sequence. In addition, the invention encompasses
polypeptides which comprise a portion of the sequence of the lengths described in this
paragraph, which further comprise additional amino acid sequences on the ends or in the
middle of sequences. The additional amino acid sequences can, for example, comprise
another protein or a functional domain thereof, such as signal peptides, membrane-binding
moieties, etc.

A polynucleotide fragment of the present invention can comprise a polymer of at
least six bases or basepairs. A fragment of the present invention can comprise at least six
bases or basepairs, at least ten bases or basepairs, at least twenty bases or basepairs, at least
forty bases or basepairs, at least fifty bases or basepairs, at least one hundred bases or
basepairs, at least one hundred fifty bases or basepairs, at least two hundred bases or
basepairs, at least two hundred fifty bases or basepairs, or at least three hundred bases or
basepairs of the gene sequence. In addition, the invention encompasses polynucleotides
which comprise a portion of the sequence of the lengths described in this paragraph, which
further comprise additional nucleic acid sequences on the 5' or 3' end or inserted into the
sequence. These additional sequences can, for example, encode a coding region of a gene
or a functional domain thereof or a promoter.

The terms "isolated", "pure", "substantially pure", and "substantially homogenous"
are used interchangeably to describe a polypeptide or polynucleotide which has been
separated from components which naturally accompany it. A monomeric protein or a
polynucleotide is substantially pure when at least about 60 to 75% of a sample exhibits a
single polypeptide or polynucleotide sequence. A substantially pure protein or
polynucleotide typically comprises about 60 to 90% by weight of a protein or
polynucleotide sample, more usually about 95%, and preferably will be over about 99%
pure.

Protein or polynucleotide purity or homogeneity may be indicated by a number of
means, such as polyacrylamide gel electrophoresis of a sample, followed by visualizing a
single band upon staining the gel. For certain purposes higher resolution can be provided
by using high performance liquid chromatography (HPLC) or other means well known in
the art for purification.

An RNA or a protein is "isolated" when it is substantially separated from the
contaminants which accompany it in its natural state. Thus, a polypeptide which is
chemically synthesized or expressed as a recombinant protein, i.e., an expression product of
an isolated and manipulated genetic sequence, is considered isolated. A recombinant
polypeptide is considered "isolated" even if expressed in a homologous cell type.

A polypeptide can be purified from cells in which it is produced by any of the
purification methods known in the art. For example, such polypeptides can be purified by
immunoaffinity chromatography employing, e.g., the antibodies provided by the present
invention. Various methods of protein purification include, but are not limited to, those
described in Guide to Protein Purification, ed. Deutscher, vol. 182 of Methods in
Enzymology, Academic Press, Inc., San Diego, 1990 and Scopes, 1982 Protein Purification:
Principles and Practice, Springer-Verlag, New York.

Polypeptide fragments of the protein of the invention are first obtained by digestion
with enzymes such as trypsin, clostripain, or Staphylococcus protease, or with chemical
agents such as cyanogen bromide, O-iodosobenzoate, hydroxylamine or
2-nitro-5-thiocyanobenzoate. Peptide fragments can be separated by reversed-phase HPLC and
analyzed by gas-phase sequencing. Peptide fragments are used in order to determine the
partial amino acid sequence of a polypeptide by methods known in the art including, but not
limited to, Edman degradation.

The present invention also provides polyclonal and/or monoclonal antibodies
capable of specifically binding to a polypeptide of the invention, or homolog, fragment,
complement or derivative thereof. Antibodies can also be produced which bind specifically
to a polynucleotide of the invention, such as an RNA of the invention or homolog,
fragment, complement or derivative thereof, and may be produced as described in, for
example, Thiry 1994 Chromosoma 103:268-76; Thiry 1993 Eur J Cell Biol 62:259-69;
Reines 1991 J Biol Chem 266:10510-7; Putterman et al. 1996 J Clin Invest 97:2251-9; and
Fournie 1996 Clin Exp Immunol 104:236-40. Antibodies capable of binding to
polypeptides or polynucleotides of the invention can be useful in detecting protein, in
titrating protein, for quantifying protein, for purifying protein or polynucleotide, or for
other uses.

For production of polyclonal antibodies, an appropriate host animal is selected,
typically a mouse or rabbit. The substantially purified antigen, whether the whole
polypeptide, a fragment, derivative, or homolog thereof, or a polypeptide coupled or fused
to another polypeptide, or polynucleotide or homolog, derivative, complement or fragment
thereof, is presented to the immune system of the host by methods appropriate for the host,
commonly by injection into the footpads, intramuscularly, intraperitoneally, or
intradermally. Peptide fragments suitable for raising antibodies can be prepared by
chemical synthesis, and are commonly coupled to a carrier molecule (e.g., keyhole limpet
hemocyanin) and injected into a host over a period of time suitable for the production of
antibodies. The sera are tested for immunoreactivity to the protein or fragment.
Monoclonal antibodies can be made by injecting the host with the protein polypeptides,
fusion proteins or fragments thereof and following methods known in the art for production
of such antibodies (Harlow et al. 1988 Antibodies: A Laboratory Manual, Cold Spring
Harbor Laboratories).

An immunological response is usually assayed with an immunoassay, a variety of
which are provided, e.g., in Harlow et al. 1988, or Goding 1986 (Monoclonal Antibodies:
Principles and Practice, 2d ed., Academic Press, New York), although any method known
in the art can be used.

Monoclonal antibodies with affinities of 10^8 M^-1, preferably 10^9 to 10^10, or stronger
are made by standard procedures as described, e.g., in Harlow et al. 1988, or Goding 1986.
Briefly, appropriate animals are immunized with the antigen by a standard protocol. After
the appropriate period of time, the spleens of such animals are excised and individual
spleen cells fused to immortalized myeloma cells. Thereafter the cells are clonally
separated and the supernatants of each clone are tested for their production of an
appropriate antibody specific for the desired region of the antigen.

Other suitable techniques of antibody production include, but are not limited to, in
vitro exposure of lymphocytes to the antigenic polypeptides or selection of libraries of
antibodies in phage or similar vectors (Huse et al. 1989 Science 246:1275-1281).

Frequently, the polypeptides and antibodies are labeled, either covalently or
noncovalently, with a substance which provides for a detectable signal. A wide variety of
labels and conjugation techniques are known. Suitable labels include, but are not limited
to, radionuclides, enzymes, substrates, cofactors, inhibitors, fluorescent agents,
chemiluminescent agents, and magnetic particles. Also, recombinant immunoglobulins can be
produced by any method known in the art.

Identification of Novel Small RNAs Using Comparative Genomics and Microarrays

As a starting point for detecting novel sRNAs in E. coli, we considered a number of
common properties of the previously identified sRNAs that might serve as a guide to
identify genes encoding new sRNAs. We define sRNAs as relatively short RNAs that
do not function by encoding a complete ORF. Of the 13 small RNAs known when this
work began, we were struck by the high conservation of these genes between closely
related organisms. In most cases, the conservation between E. coli and Salmonella was
above 85%, whereas that of the typical gene encoding an ORF was frequently below 70%.
Conservation tests on random noncoding regions of the genome suggested that extended
conservation in intergenic regions was unusual enough to be used as an initial parameter to
screen for new sRNA genes. We therefore tested this approach to look for novel sRNAs in
the E. coli genome.

All known sRNAs are encoded within intergenic (Ig) regions (defined as regions
between ORFs). A file containing all Ig sequences from the E. coli genome (Blattner, F.R.
et al. 1997 Science 277:1453-1474) was used as a starting point for our homology search.
We arbitrarily chose the 1.0- to 2.5-Mb region of the 4.6-Mb E. coli genome to test and
refine our approach and developed the following steps for searching the full E. coli
genome.
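As an illustration only, the definition of an Ig region as the sequence lying between adjacent ORFs can be sketched in Python as follows. The function name, the coordinate layout, and the toy coordinates are hypothetical and are not taken from the file used in this work; wraparound of the circular chromosome is ignored for simplicity.

```python
def intergenic_regions(orfs, min_len=1):
    """Return (start, end) gaps between consecutive ORFs.

    orfs: iterable of (start, end) pairs, 1-based inclusive coordinates.
    min_len: smallest gap length (in nt) that is reported.
    Hypothetical helper; overlapping ORFs are merged implicitly and
    the circular nature of the chromosome is not handled.
    """
    gaps = []
    prev_end = None
    for start, end in sorted(orfs):
        if prev_end is not None:
            gap_len = start - prev_end - 1
            if gap_len >= max(min_len, 1):
                gaps.append((prev_end + 1, start - 1))
            prev_end = max(prev_end, end)
        else:
            prev_end = end
    return gaps


# Toy coordinates (not real E. coli genes).
toy_orfs = [(100, 400), (700, 900), (905, 1300), (350, 500)]
print(intergenic_regions(toy_orfs))               # every gap
print(intergenic_regions(toy_orfs, min_len=180))  # gaps of 180 nt or larger
```

The `min_len` argument allows a size cutoff, such as the 180-nt cutoff applied in the screen, to be imposed at the same step.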

All Ig regions of 180 nucleotides (nt) or larger were compared to the NCBI
Unfinished Microbial Genomes database using the BLAST program (Altschul, S.F. et al.
1990 J Mol Biol 215:403-410). These 1097 Ig regions were rated based on the degree of
conservation and length of the conserved region when compared to the closely related
Salmonella and Klebsiella pneumoniae species. The highest rating was given to Ig regions
with a high degree of conservation (raw BLAST score of >80) over at least 80 nt (see below
for explanation of ratings). Note that most promoters do not meet these length and
conservation requirements. Figure 1 shows a set of BLAST searches for three known
sRNAs (RprA RNA, CsrB RNA, OxyS RNA), three Ig regions with high conservation
(#14, #17, #52) and one Ig region with intermediate conservation (#36). Some Ig regions
had a large number of matches, often to several chromosomal regions of the same
organism. These Ig regions were noted and many were found to contain tRNAs, rRNAs,
REP, or other repeated sequences. The 40 highly conserved Ig regions containing tRNAs
and/or rRNAs were eliminated from our search, as these regions were complicated in their
patterns of conservation.
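The criterion for the highest rating (raw BLAST score of >80 over at least 80 nt) might be expressed as in the following sketch. Only the top tier is modeled, because only its thresholds are stated in this passage; the function names, the input layout, and the toy hits are hypothetical.

```python
def meets_highest_rating(raw_score, aligned_len, min_score=80, min_len=80):
    """True if a hit has a raw score above min_score over at least min_len nt."""
    return raw_score > min_score and aligned_len >= min_len


def top_rated(hits_by_region):
    """Return the labels of regions with at least one top-tier hit.

    hits_by_region: dict mapping a region label to a list of
    (raw_score, aligned_len) tuples (hypothetical layout).
    """
    return sorted(
        region
        for region, hits in hits_by_region.items()
        if any(meets_highest_rating(score, length) for score, length in hits)
    )


# Toy hits, not taken from Figure 1.
toy_hits = {
    "ig_A": [(152, 160)],           # high score over a long stretch
    "ig_B": [(95, 60), (70, 120)],  # each hit fails one of the two tests
    "ig_C": [(81, 80)],             # just inside both thresholds
}
print(top_rated(toy_hits))
```

Requiring both tests on the same hit reflects the observation that short, well-conserved elements such as most promoters should not reach the highest rating.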

Next the orientation and identity of the ORFs bordering the Ig regions were
determined using the Colibri database, an annotated listing of all E. coli genes and their
coordinates. Inconsistencies between the Colibri database and our original file led to the
reclassification of some Ig regions as shorter than 180 nt, and these were not analyzed
further. Of the remaining 1006 Ig regions, 13 contained known small RNAs, 295 were in
the highest conservation group, 88 showed intermediate conservation, and 610 showed no
conservation.

The location of the conservation relative to the orientation of the flanking ORFs was
an important consideration in choosing candidates for further analysis. In many cases
(132/295 Ig regions), the conserved region was just upstream of the start of an ORF,
consistent with conservation of regulatory regions, including untranslated leaders. Cases
where the conserved region was >50 nt from an ORF start or extended over more than 150
nt in length (RprA RNA, CsrB RNA, OxyS RNA, #17, and #52 in Fig. 1), or where the
bordering ORFs ended rather than started at the Ig region (#14 in Fig. 1), were considered
better candidates for novel sRNAs.
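The positional reasoning in the preceding paragraph can be summarized, purely as a sketch, by a single predicate. The argument names are hypothetical, and the actual selection also drew on the literature and on expression data, which are not modeled here.

```python
def is_better_candidate(dist_to_orf_start, conserved_len, flanking_orfs_end_here):
    """Apply the positional criteria stated in the text.

    dist_to_orf_start: distance (nt) from the conserved region to the nearest
        ORF start, or None if no flanking ORF starts at this Ig region.
    conserved_len: length (nt) of the conserved region.
    flanking_orfs_end_here: True if the bordering ORFs end, rather than
        start, at the Ig region.
    """
    far_from_start = dist_to_orf_start is not None and dist_to_orf_start > 50
    long_conservation = conserved_len > 150
    return flanking_orfs_end_here or far_from_start or long_conservation


# Toy inputs illustrating each branch.
print(is_better_candidate(10, 90, False))     # just upstream of an ORF start
print(is_better_candidate(120, 90, False))    # >50 nt from the ORF start
print(is_better_candidate(10, 200, False))    # conservation over >150 nt
print(is_better_candidate(None, 90, True))    # flanking ORFs end at the Ig
```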

Published information on promoters and other known regulatory sites within
conserved regions of promising candidates was tabulated and used to eliminate many
candidates in which the conservation could be attributed to previously identified promoters
or 5' untranslated leaders. Finally, the remaining candidate regions were examined for
sequence elements such as potential promoters, terminators, and inverted repeat regions.
We considered evidence for possible stem-loops, in particular those with characteristics of
rho-independent terminators, as especially indicative of possible sRNA genes (Table 1).
Using these criteria, together with microarray expression data (see below), a set of
59 candidates was selected (Table 1). Candidates 1-18 were chosen in the first round of
screening of the 1.0- to 2.5-Mb region; some of these candidates would not have met the
higher criteria applied to the rest of the genome.

^a Candidate numbers. #23 was not analyzed; the region of conservation corresponds to a
published leader sequence. Candidate #61 was added because it is homologous to
candidate #43 and the duplicated regions within #55 (see text and Table 2).

^b Orientation of flanking genes. > and < denote genes present on the clockwise (Watson) or
counterclockwise (Crick) strand of the E. coli chromosome, respectively.

^c Criteria used for selection of candidates: C, conservation; C*, long conservation; (#),
conservation score. Ig regions were assigned scores on the basis of BLAST searches (see
text below). #4 and #32 were rerated from 4 (conserved) to 0 on reanalysis of the endpoints
of the flanking ORF (#4) and information on an ORF within the Ig region (#32). L,
location of conservation either far from 5' end of flanking gene or near 3' end of gene; S,
signal detected in microarray experiments; S*, microarray signal on opposite strand to
flanking genes; I, inverted repeat; P, predicted promoter; T, predicted terminator; D,
duplicated gene.

^d Detection on high-density oligonucleotide probe arrays. > <, orientation of signal as in b.
Rif, signals present after 20 min treatment with rifampicin.

^e Northern analysis of RNA extracted from MG1655 cells grown in three conditions (LB
medium, exponential phase; minimal medium, exponential phase; LB medium, stationary
phase). Strand specific probes were used for sRNA and mRNAs encoding novel ORFs
(orientation noted < or > as in b); double stranded DNA probes were used for the rest. For
#43, bands were originally detected with a double stranded probe, but appear to be from
homologs (see text). Large, >400 nt.

^f Interpretation of high conservation was based on microarray and Northern analyses as well
as literature. "mRNAs", small RNA transcripts predicted to encode new polypeptides (see
text). "known leaders", literature references supported the existence of leaders
corresponding to conservation. For #37, conservation is consistent with the leader of the
arcA gene (Compan, I. and Touati, D. 1994 Mol Microbiol 11:955-964). The ORF noted
for #56 is described in Seoane, A.S. and Levy, S.B. 1995 J Bacteriol 177:530-535; and
Bouvier, L et al. 1992 J Bacteriol 174:5265-5271; see GenBank entry BAA16347.1. The
IS sequence fragment in the conserved region of #48 is homologous to that described by
McVeigh, A. et al. 2000 Infect Immun 68:5710-5715. "leaders", a large band on Northern
analysis, coupled with conservation near the 5' end of an ORF. "promoter/leader?",
absence of RNA signal, coupled with conservation near the 5' end of a gene.
"leader/promoter?", RNA signal from microarray or Northern analyses suggested a leader,
while the conservation is far from the expected position of a leader. "leader or operon"
(for #29), microarray analysis suggested a continuous transcript throughout Ig. "predicted
sRNAs" (for #8 and #43), Igs contain the hallmarks expected for an sRNA, but RNA
transcripts were not detected. Igs encoding sRNAs also may include leaders; this is not
included in the conclusion column.

Selecting Candidate Genes by Whole Genome Expression Analysis

In an independent series of experiments, high-density oligonucleotide probe arrays
were used to detect transcripts that might correspond to sRNAs from Ig regions. Total
RNA isolated from MG1655 cells grown to late exponential phase in LB medium was
labeled for probes or used to generate cDNA probes (see text below). From a single RNA
isolation each labeling approach was carried out in duplicate and individually hybridized to
high-density oligonucleotide microarrays. The high-density oligonucleotide probe arrays
used are appropriate for this analysis as they have probes specific for both the clockwise
(Watson) and counterclockwise (Crick) strands of each Ig region as well as for the sense
strand of each ORF. The resulting data from the four experiments were analyzed to
examine global expression within Ig regions, as well as neighboring ORFs.

Our criteria for analyzing the microarray data evolved during the course of this
analysis. Stringent criteria (longer transcripts in the Ig region, higher expression levels)
identified many of the previously known sRNAs but did not uncover many strong
candidates for new small RNAs. More relaxed criteria (shorter transcripts, lower
expression levels) gave a very large number of candidates and therefore were not by
themselves useful as the initial basis for identifying candidates. However, these data were
very useful as an additional criterion for selection of candidate regions based on the
conservation approach. Detection of a transcript by microarray on the strand opposite to
that of surrounding ORFs was considered a strong indicator of an sRNA (S* in Table 1).
Microarray data contributed to the selection of 34 of 59 candidates (Table 1). Examples of
the different types of expression observed in microarray experiments are shown in Figure 2.
Signal corresponding to CsrB RNA clearly is detected on the Crick (C) strand. #17 and
#36 have a transcript in the Ig region on the opposing strand (C) to that for the flanking
genes (Watson; W). However, the expression patterns were not as obvious in many cases,
either because expression levels were low or because the pattern of expression could be
interpreted in a number of ways. For instance, very little expression was detected for RprA
RNA encoded on the W strand, and there is unexplained signal detected from the opposite
strand of the rprA and csrB Ig regions. #14 and #52 also had some expression on each
strand (Figure 2). #14 proved to express a small RNA from the Watson strand, while #52
expresses sRNAs from each strand (see below and Table 2).
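The S* criterion (microarray signal on the strand opposite to that of the surrounding ORFs) can be sketched as a simple comparison of orientations. The function name is hypothetical. The sketch assumes that the flag applies only when both flanking genes lie on the same strand, which is one reading of the criterion rather than a rule stated in the text.

```python
def opposite_strand_signal(signal_strand, left_gene_strand, right_gene_strand):
    """Return True if the Ig signal lies on the strand opposite to both
    flanking genes. Strands are given as '>' (Watson) or '<' (Crick).
    """
    for strand in (signal_strand, left_gene_strand, right_gene_strand):
        if strand not in (">", "<"):
            raise ValueError(f"unknown strand symbol: {strand!r}")
    return signal_strand != left_gene_strand and signal_strand != right_gene_strand


# Pattern described for #17 and #36: flanking genes on W, Ig signal on C.
print(opposite_strand_signal("<", ">", ">"))
# Signal on the same strand as the flanking genes (could be a leader or operon).
print(opposite_strand_signal(">", ">", ">"))
```

With only two strands, divergent or convergent flanking genes can never satisfy this predicate, so such regions would have to be judged on the other criteria.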

Given that a number of the known sRNAs are relatively stable, we tested whether
selection for stable RNAs might allow the microarray data to be more useful for de novo
identification of sRNA candidates. The transcription inhibitor rifampicin was added to
cells for 20 min prior to harvesting the RNA with the intention of enriching for stable
RNAs. Many of the known sRNAs can be detected after the rifampicin treatment. Of the
59 candidates in Table 1, twelve retained a hybridization signal (marked rif in Table 1), and
four of these proved to correspond to small transcripts (see below). Other rif-resistant
transcripts detected in Ig regions appeared to be due to highly expressed leaders.

Table is divided into three sections: detected sRNAs, predicted sRNAs, and detected
RNAs predicted to encode small ORFs.

Sizes estimated from Northern analyses using single stranded RNA probes or
oligonucleotide probes, or from predictions resulting from sequence analysis (see text).

> < denotes orientation of sRNA and flanking genes as in Table 1.

Relative expression in three growth conditions: E, LB medium, exponential phase; M,
minimal medium, exponential phase; and S, LB medium, stationary phase.

RNA coimmunoprecipitation with Hfq as detected by Northern analysis: +, strong binding
(>30% of RNA bound); +/-, weak binding (5-10%); -/+, minimal binding (<5%); and -, no
detectable binding. E, M, S refer to cell growth conditions as defined for relative
expression. NT, not tested.

Expression of rpoS-lacZ fusion in the presence of multicopy plasmids carrying intergenic
regions. Activity was measured in stationary phase in LB medium (S) or minimal medium
(M) and normalized to the activity of the vector control in the same experiment. In parallel
experiments, cells carrying the vector alone gave 1.3-2 (S) and 0.7-2.6 (M) units; cells
carrying pRS-DsrA plasmid gave a 4.9 fold increase (S) and 12 fold increase (M); cells
carrying pRS-RprA plasmid gave 3.1 fold (S) and 3.3 fold (M) increase. Results in table
are average of at least three independent assays. Values in bold were considered
significantly different from the control. NT, not tested.

#41 and #52 each express two sRNAs so it is not possible to assign a phenotype to a given
small RNA. Thus far there is no evidence for a strong phenotype for either candidate.

Included is information about additional RNA bands detected in Northern analysis.

Small RNA Transcripts Detected by Northern Hybridization

The final test for the presence of an sRNA gene was the direct detection of a small
RNA transcript. The candidates in Table 1 were analyzed by Northern hybridization using
RNA extracted from MG1655 cells harvested from three growth conditions (exponential
phase in LB medium, exponential phase in M63-glucose medium, or stationary phase in LB
medium). The microarray analysis discussed above used RNA isolated from cells grown to
late exponential phase in LB medium, which is intermediate between the two LB growth
conditions used for the Northern analysis. Initially, Northern analysis was carried out using
double-stranded DNA probes containing the full Ig region for most candidates. In three
cases (#8, #22, and #55) PCR amplification of the Ig region to generate a probe was not
successful and therefore oligonucleotide probes were used for Northern analysis.
Seventeen candidates gave distinct bands consistent with small RNAs, and one additional
candidate gave a somewhat larger RNA, but the location of conservation was not consistent
with a leader sequence for a flanking ORF (#36). In some of these cases, two or more RNA
species were detected with a single Ig probe (Table 2, see also Fig. 3). One candidate (#43)
gave a signal with the double stranded DNA probe, but contains regions duplicated
elsewhere in E. coli that probably account for this signal (see below). Of the remaining 41
candidates, 17 gave no detectable transcript. These Ig regions could encode sRNAs
expressed only under very specific growth conditions. For instance, #8 has all the sequence
hallmarks of an sRNA gene (a well-conserved region preceded by a possible promoter and
ending with a terminator), but has not been detected. Alternatively, the observed
conservation could be due to nontranscribed regulatory regions. Fairly large RNAs were
detected for another 24 candidates. Given the size of these transcripts together with data on
the orientation of flanking genes and the location of conserved regions, it is likely these are
leader sequences within mRNAs (Table 1).

For candidates expressing RNAs not expected to be 5' untranslated leaders,
Northern analysis was carried out with strand-specific probes to determine gene orientation
(Fig. 3). For many of the candidates, we used sequence elements (see below) as well as
expression information from the microarray experiments to predict which strand was most
likely expressed; both strands were tested when predictions were unclear. The results from
the strand-specific probes generally agreed with predictions and were used to estimate the
RNA size (Table 2). Interestingly, in one case there is an sRNA expressed from both the W
and C strand within the Ig (#52; Fig. 3). For #12, although no sRNA had been detected
using a double stranded DNA probe, the presence of a potential terminator and promoter
remained suggestive of the presence of an sRNA gene. Therefore, oligonucleotide probes
also were used in Northern analysis of this candidate, and a small RNA transcript was
detected (Fig. 3; Table 1).

Examination of expression profiles of the RNAs under different growth conditions
gave an indication of specificity of expression. Some candidates were detected under all
three growth conditions; others were preferentially expressed under one growth condition
(Fig. 3; Table 2). For instance, #25 was present primarily during growth in minimal
medium, consistent with the absence of detection in the whole genome expression
experiment, which analyzed RNA isolated from cells grown in rich medium.

Sequence Predictions of sRNA Genes and ORFs

For the candidates expressing small RNA transcripts, the conserved sequence
blocks (contigs) from K. pneumoniae, the most highly conserved Salmonella species, and in a
few cases Yersinia pestis, were selected from the NCBI Unfinished Microbial Genome
database and aligned with the E. coli Ig region using GCG Gap (Devereux, J. et al. 1984
Nucleic Acids Res 12:387-395). Multiple alignments were assembled by hand, and the
conserved regions were examined for likely promoters and terminators and other conserved
structures. Information from the alignments, together with results from strand-specific
Northern and microarray expression analyses, allowed assignments of gene orientation,
putative regulatory regions, and RNA length from the predicted starting and ending
positions. Where a terminator sequence was very apparent (13 of 19 candidates),
transcription was assumed to end at the terminator, and the observed size of the transcript
was used to help identify possible promoters. The identification of promoters and
terminators was less definite when there was only one species with conservation to E. coli.

As the alignments were assembled, the pattern of conservation in some cases was
reminiscent of patterns expected from ORFs, with higher sequence variation in positions
consistent with the third nucleotide of codons. GCG Map (Devereux, J. et al. 1984 Nucleic
Acids Res 12:387-395) was used to predict translation in all frames for all of the candidate
small RNAs. In six cases, the conservation and translation potential suggested the presence
of a short ORF. In these cases, a ribosome-binding site and the potential ORF were well
conserved, with the most variation in the third position of codons, but other elements of the
predicted RNA were less well conserved. For example, #17 expresses an RNA of about
266 nt, containing a predicted ORF of only 19 amino acids. Within the predicted
Shine-Dalgarno sequence and ORF, only 9/80 positions showed variation for either Klebsiella or
Salmonella, while the overall RNA is less than 60% conserved. We predict that for #17, as
well as five others (Table 2), the detected RNA transcript is functioning as an mRNA,
encoding a short, conserved ORF. An evaluation of both the new predicted ORFs and the
untranslated sRNAs with GLIMMER, a program designed to predict ORFs within
genomes, gave complete agreement with our designations (Delcher, A.L. et al. 1999
Nucleic Acids Res 27:4636-4641).
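The signature described above, in which variation is concentrated at the third nucleotide of codons, can be illustrated by tallying mismatches by codon position in a gap-free, in-frame pairwise alignment. This is a sketch only; the function name is hypothetical, the two sequences are synthetic and are not the #17 alignment, and it does not reproduce GCG Map or GLIMMER.

```python
def variation_by_codon_position(ref, other):
    """Count mismatches at codon positions 1, 2 and 3.

    ref, other: equal-length, gap-free nucleotide strings aligned in frame,
    with index 0 being the first nucleotide of the first codon.
    Returns a list [n_pos1, n_pos2, n_pos3].
    """
    if len(ref) != len(other):
        raise ValueError("sequences must have equal length")
    counts = [0, 0, 0]
    for index, (a, b) in enumerate(zip(ref.upper(), other.upper())):
        if a != b:
            counts[index % 3] += 1
    return counts


# Synthetic example: ATG AAA CGT CTG TAA versus ATG AAG CGC CTA TAA.
ref_seq = "ATGAAACGTCTGTAA"
other_seq = "ATGAAGCGCCTATAA"
print(variation_by_codon_position(ref_seq, other_seq))
```

A tally weighted toward the third position in one frame, but not in the other two, is the kind of pattern that would be consistent with a conserved ORF.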

We have assigned gene names to all candidates that we have confirmed are
expressed as RNAs (see Table 2). The genes we predict to encode ORFs were given names
according to accepted practice for ORFs (Rudd, K.E. 1998 Microbiol Mol Biol Rev
62:985-1019). The genes that express sRNAs without evidence of conserved ORFs were named
with a similar nomenclature: ryx, with ry denoting RNA and x indicating the 10 min
interval on the E. coli genetic map.
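The naming scheme can be sketched as a mapping from map position to a gene-name prefix. The text states only that the third letter indicates the 10 min interval; the assignment of the letters a through j to successive intervals of a 100 min map is an assumption made for this sketch, and the function names are hypothetical.

```python
import string


def map_interval_letter(minute):
    """Letter for the 10 min interval containing a map position.

    Assumes a 100 min genetic map with letters 'a' to 'j' assigned to the
    intervals 0-10, 10-20, ..., 90-100 min (an assumption of this sketch).
    """
    if not 0 <= minute < 100:
        raise ValueError("map position must be in [0, 100) min")
    return string.ascii_lowercase[int(minute // 10)]


def ry_prefix(minute):
    """Return the 'ry' + interval-letter prefix for an sRNA gene."""
    return "ry" + map_interval_letter(minute)


print(ry_prefix(5.0))
print(ry_prefix(77.0))
```

A final uppercase letter distinguishing genes within the same interval would be appended to this prefix; how those letters were allocated is not modeled here.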

We noted one instance of overlap in sequence between our new sRNAs. The
conserved region within #43 is highly homologous to a duplicated region within #55, as
well as to a fourth region of the chromosome within a more poorly conserved Ig (#61 in
Table 1). This repeated region was previously denoted the QUAD repeat and suggested to
encode sRNAs (Rudd, K.E. 1999 Res Microbiol 150:653-664). Each of the QUAD repeats
contains a short stretch homologous to boxC, a repeat element of unknown function present
in 50 copies or more within the genome of E. coli (Bachellier, S. et al. 1996 Repeated
Sequences. In: Escherichia coli and Salmonella: Cellular and Molecular Biology, eds. F.C.
Neidhardt et al., pp. 2012-2040, American Society for Microbiology, Washington, D.C.).
Rudd also has detected transcripts from the QUAD regions. To determine which of the
four QUAD genes was being expressed, we designed oligonucleotide probes unique for
each of the four repeats. These oligonucleotide probes demonstrated expression for three of
the four QUAD genes (#55-I, #55-II, and #61); furthermore, each gave two RNA bands
(Fig. 3; Table 2). No signal was detected for the fourth repeat (#43). The #41 Ig region
encodes another pair of repeats, PAIR2 (Rudd, K.E. 1999 Res Microbiol 150:653-664), and
we observed two RNA species, suggesting that each of the repeats may be transcriptionally
active. Finally, another repeat region noted by Rudd, PAIR3, is encoded by the #22 Ig
region.

Many sRNAs Bind Hfq and Modulate rpoS Expression 

Hfq is a small, highly abundant RNA-binding protein first identified for its role in
replication of the RNA phage Qβ (Franze de Fernandez, M. et al. 1968 Nature 219:588-590;
reviewed in Blumenthal, T. and Carmichael, G.G. 1979 Annu Rev Biochem 48:525-548).
Recently, Hfq has been shown to be involved in a number of RNA transactions in the
cell, including translational regulation (rpoS), mRNA polyadenylation, and mRNA stability
(ompA, mutS, and miaA) (Muffler, A. et al. 1996 Genes & Dev 10:1143-1151; Tsui,
H.-C.T. et al. 1997 J Bacteriol 179:7476-7487; Vytvytska, O. et al. 1998 PNAS USA
95:14118-14123; Hajnsdorf, E. and Regnier, P. 2000 PNAS USA 97:1501-1505; Vytvytska,
O. et al. 2000 Genes & Dev 14:1109-1118). Three of the known E. coli sRNAs regulate
rpoS expression: DsrA RNA and RprA RNA positively regulate rpoS translation, whereas
OxyS RNA represses its translation. In all three cases the Hfq protein is required for
regulation (Zhang, A. et al. 1998 EMBO J 17:6061-6068; Majdalani, N. et al. 2001 Mol
Microbiol 39:1382-1394; Sledjeski, D.D. et al. 2001 J Bacteriol 183:1997-2005), and
binding studies have revealed a direct interaction between Hfq and the OxyS and DsrA
RNAs (Zhang, A. et al. 1998 EMBO J 17:6061-6068; Sledjeski, D.D. et al. 2001 J
Bacteriol 183:1997-2005).

Given the interaction of the Hfq protein with at least three of the known sRNAs, we
asked how many of the newly discovered sRNAs are bound by this protein. Hfq-specific
antisera were used to immunoprecipitate Hfq-associated RNAs from extracts of cells grown
under the conditions used for the Northern analysis. Total immunoprecipitated RNA was
examined using two methods. First, RNA was 3'-end labeled and selected RNAs were
visualized directly on polyacrylamide gels. Under each growth condition, several RNA
species co-immunoprecipitated with Hfq-specific sera but not with preimmune sera, which
indicates that many sRNAs interact with Hfq (Fig. 4A). Second, selected RNAs were
examined using Northern hybridization to determine whether other known sRNAs and any
of our newly discovered sRNAs interact with Hfq. For each sRNA, Hfq binding was
examined under growth conditions where the sRNA was most abundant (Fig. 4B; Table 2).
sRNAs present in samples using the Hfq antisera but not preimmune sera were concluded
to interact with Hfq. Comparison of levels of a selected sRNA relative to the total amount
of that sRNA in the extract revealed that many of the sRNAs bound Hfq quite efficiently
(>30% bound) (#14, #24, #25, #26, #31, #41, #52-II, Spot42 RNA, and RprA RNA), but
other sRNAs bound Hfq less efficiently (<10% bound) (#9, #17, and #52-I), or not at all
(#27, #38, #40, 6S RNA, 5S RNA, and tmRNA) (Fig. 4; Table 2).
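The binding categories reported for the Hfq coimmunoprecipitation (thresholds as given in the Table 2 legend) can be sketched as a simple classifier. The function name is hypothetical. The legend does not assign a symbol to values between 10% and 30%, so the sketch returns "unclassified" in that range, and treating a value of zero as "no detectable binding" is likewise an assumption.

```python
def hfq_binding_class(percent_bound):
    """Map the percentage of an sRNA recovered with Hfq to a legend symbol.

    '+'   strong binding   (>30% of RNA bound)
    '+/-' weak binding     (5-10%)
    '-/+' minimal binding  (<5%)
    '-'   no detectable binding (modeled here as 0%)
    """
    if not 0 <= percent_bound <= 100:
        raise ValueError("percent_bound must be between 0 and 100")
    if percent_bound == 0:
        return "-"
    if percent_bound < 5:
        return "-/+"
    if percent_bound <= 10:
        return "+/-"
    if percent_bound > 30:
        return "+"
    return "unclassified"


# Toy percentages, not measured values.
for value in (0, 2.5, 7, 20, 45):
    print(value, hfq_binding_class(value))
```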

As mentioned above, at least three of the known sRNAs that interact with Hfq also
regulate translation of rpoS, the stationary phase sigma factor. In light of the fact that many
of the new sRNAs also interact with Hfq, we examined whether these new sRNAs affect
rpoS expression. Plasmids carrying the Ig regions encoding either control sRNAs
(pRS-DsrA and pRS-RprA) or many of our novel sRNAs were introduced into an MG1655 Δlac
derivative carrying an rpoS-lacZ translational fusion. We then compared expression of the
rpoS-lacZ fusion in these cells to cells carrying the control vector by measuring
β-galactosidase activity at stationary phase in LB or M63-glucose medium (Table 2). As
expected, overproduction of either DsrA RNA or RprA RNA increased rpoS-lacZ
expression significantly (Table 2 legend). A number of plasmids (pRS-#24, pRS-#31) led
to increased rpoS-lacZ expression, whereas others (pRS-#12, pRS-#14, and pRS-#25) led
to decreased expression. These results indicate that the corresponding sRNAs may directly
regulate rpoS expression or indirectly affect rpoS expression by altering Hfq activity,
possibly by competition. Intriguingly, there is not a complete correlation between Hfq
binding and altered rpoS-lacZ expression in these studies.
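The normalization used for the rpoS-lacZ measurements (activity relative to the vector control in the same experiment, averaged over independent assays) can be sketched as follows. The function name and the numerical values are illustrative only and are not data from Table 2.

```python
from statistics import mean


def mean_fold_change(assays):
    """Average of per-experiment activity ratios.

    assays: list of (plasmid_activity, vector_activity) pairs, one pair per
    independent experiment, in the same units within each pair.
    """
    if not assays:
        raise ValueError("at least one assay is required")
    ratios = []
    for plasmid_activity, vector_activity in assays:
        if vector_activity <= 0:
            raise ValueError("vector control activity must be positive")
        ratios.append(plasmid_activity / vector_activity)
    return mean(ratios)


# Three hypothetical experiments, each with its own vector control.
toy_assays = [(4.0, 2.0), (3.0, 1.5), (5.0, 2.5)]
print(mean_fold_change(toy_assays))
```

Normalizing within each experiment before averaging keeps day-to-day variation in the vector control from dominating the comparison.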

As another strategy in defining possible functions for the sRNAs, we screened
strains carrying the multicopy plasmids for effects on growth in LB medium at various
temperatures as well as growth in minimal medium containing a number of different carbon
sources. pRS-#25 renders cells unable to grow on succinate, in agreement with predictions
for #25 RNA interaction with sdh mRNA (discussed below). We were unable to isolate
plasmids carrying the #27 Ig region without mutations, indicating that overproduction of
this small RNA may interfere with growth. No other growth phenotypes were observed. A
caveat for the interpretation of results with the multicopy plasmids is that they contain the
full intergenic region; therefore, we cannot rule out effects of sequences outside the sRNA
genes but within the intergenic regions.

In summary, a multifaceted search strategy to predict sRNA genes was validated by
our discovery of 17 novel sRNAs. Northern analysis determined that 44 of 60 candidate
regions express RNA transcripts, some of them expressing more than one RNA species. Of
these transcripts, 24 were concluded to be 5' untranslated leaders for mRNAs of flanking
genes, and another six are predicted to encode new, short ORFs (Tables 1 and 2). The 17
transcripts believed to be novel, functional sRNAs range from 45 nt to 320 nt in length and
vary significantly in expression levels and expression profiles under different growth
conditions. More than half of the new sRNAs were found to interact with the RNA-binding
protein Hfq, indicating that Hfq binding may be a defining characteristic of a family of
prokaryotic sRNAs.
Evaluation of Selection Criteria 

Three general approaches for predicting sRNA genes were evaluated in this work.
In the primary approach, Ig regions were scored for degree and length of conservation
between closely related bacterial species followed by examination of sequence features.
This approach proved to be very productive in identifying Ig regions encoding novel
sRNAs in E. coli; more than 30% of the candidates selected primarily on the basis of their
conservation proved to encode novel small transcripts. The availability of nearly
completed genome sequences for Salmonella and Klebsiella made this approach possible.
Any organism for which the genome sequences of closely related species are known can be
analyzed in this way. Comparative genomics of this sort has been used before to search
for regulatory sites (for review, see Gelfand, M.S. 1999 Res Microbiol 150:755-771), but
has not been employed previously to find sRNAs.

Although we found the conservation-based approach to be the most productive in
identifying sRNA genes, we note a number of limitations to its use. A high level of
conservation is not sufficient to indicate the presence of an sRNA gene. Many of the most
highly conserved regions, not unexpectedly, were consistent with regulatory and leader
sequences for flanking genes. We also did not analyze any Ig regions where conservation
was attributable to sources other than an sRNA. For example, potential sRNAs processed
from mRNAs, or any sRNAs encoded by the antisense strand of ORFs or leaders, may have
been missed in our approach. We made the assumption that Ig regions must be >180 nt to
encode an sRNA of >60 nt, a 50-60-nt promoter and regulatory region to control expression
of the sRNA, as well as regulatory regions for flanking genes. Any sRNA genes in smaller
Ig regions would have been overlooked. We also excluded the highly conserved tRNA and
rRNA operons from our consideration because of their complexity. It is certainly possible
that sRNA genes may be associated with these other RNA genes. In fact, sRNA genes have
been predicted to be encoded in at least one tRNA operon. In addition, conservation need
not be a property of all sRNAs. We expect sRNAs that play a role in modulating cellular
metabolism to be well conserved, as is the case for the previously identified sRNAs.
Nevertheless, sRNAs may be encoded within or act upon regions for which there is no
homology between E. coli, Klebsiella, and Salmonella (e.g., in cryptic prophages and
pathogenicity islands), and they would be missed by this approach. Only one of 24 Ig
regions within the e14, CP4-54, or CP4-6 prophages showed conservation. A few of these
Ig regions showed evidence of transcription by microarray analysis, and RNAs have been
implicated in immunity regulation in phage P4 (Ghisotti, D. et al. 1992 Mol Microbiol
6:3405-3413), which is related to the prophages CP4-54 and CP4-6. Despite the limitations
listed above, however, we believe the use of conservation provides a relatively quick
identification of the majority of sRNAs.

An alternative genomic sequence-based strategy for identifying sRNAs would be to
search for orphan promoter and terminator elements as well as other potential RNA
structural elements. Potential promoter elements were generally too abundant to be useful
predictors without other information on their expected location and orientation. We found
sequences predicted to be rho-independent terminators a more useful indicator of sRNAs;
such sequences were clearly present for 13/17 of the sRNAs and 3/6 of the new mRNAs.
In a number of cases, it appears that the sRNAs share a terminator with a convergent gene
for an ORF. In other cases, either no terminator was detected or it appeared to be in a
neighboring ORF. A search using promoter and terminator sequences as the requirements
for identifying sRNAs might therefore have found two-thirds of the sRNAs described here.
Phage integration target sequences also could be scanned for nearby sRNA genes. Many
phage att sites overlap tRNAs (reviewed in Campbell, A.M. 1992 J Bacteriol 174:7495-7499),
and ssrA, encoding the tmRNA, has a 3' structure like a tRNA and overlaps the att
site of a cryptic prophage (Kirby, J.E. et al. 1994 J Bacteriol 176:2068-2081). In this work,
we found that the 3' end and terminator of #14 overlaps the previously mapped phage P2
att site (Barreiro, V. and Haggard-Ljungquist, E. 1992 J Bacteriol 174:4086-4093). #14
sRNA does not obviously resemble a tRNA, suggesting that the overlap between phage att
sites and RNA genes extends beyond tRNAs and related molecules and may be common to
additional sRNAs.

Our second approach, high-density oligonucleotide probe array expression analysis,
proved to be more useful in confirming the presence of sRNA genes first found by the
conservation approach than in identifying new sRNA genes de novo. Further consideration
of the location of microarray signal compared to flanking genes as well as analysis of
microarray signals after a variety of growth conditions should expand the ability to detect
sRNAs in this manner. Under a single growth condition, signal consistent with the RNA
identified by Northern analysis was detected for 5/15 of the Ig regions proven to encode
new sRNAs and for 4/6 of the new mRNAs. Thus, a similar analysis of microarray data in
nonconserved genomic regions might help in the identification of sRNAs missed by the
conservation-based approaches. We predict that sRNAs from any organism expressed at
reasonably high levels under normal growth conditions will be detected by microarrays that
interrogate the entire genome, inclusive of noncoding regions.

One clear limitation in detecting sRNAs with microarray or Northern analyses is the
fact that some sRNAs may be expressed only under limited growth conditions or at
extremely low levels. We chose three growth conditions to scan our samples. While most
of the previously known sRNAs were seen under these conditions, OxyS RNA, which is
induced by oxidative stress, was not detectable. For a few of our candidates in which no
RNA was detected, it is possible that an sRNA is encoded but is not expressed sufficiently
to be detected under any of our growth conditions. Another possible limitation of
hybridization-based approaches is that highly structured sRNAs may be refractory to probe
generation. sRNA transcripts may not remain quantitatively represented after the
fragmentation used in the direct labeling approach here. cDNA labeling also may
underrepresent sRNAs because they are a small target for the oligonucleotide primers, and
secondary structure can interfere with efficiency of extension.

As our third approach, sRNAs were selected on the basis of their ability to bind to
the general RNA binding protein, Hfq. Northern analysis revealed that many of our novel
sRNAs interact with Hfq. In preliminary microarray analysis of Hfq-selected RNAs to look
for additional unknown sRNAs, DsrA RNA, DicF RNA, Spot42 RNA, #14, #24, #25, #31,
#41, and #52-II were detected among those RNAs with the largest difference in levels
between Hfq-specific sera and pre-immune sera. This preliminary experiment suggests that
microarray analysis of selected RNAs will be very valuable on a genome-wide basis.
Interestingly, a large number of genes with leaders and a number of RNAs for operons were
found to co-immunoprecipitate with Hfq (including the known Hfq target nlpD-rpoS
mRNA; Brown, L. and Elliott, T. 1996 J Bacteriol 178:3763-3770). It seems likely that the
subset of sRNAs binding a common protein will represent a subset in terms of function; the
sRNAs of known function associated with Hfq in our experiments appear to be those
involved in regulating mRNA translation and stability. Other sRNAs have been shown to
interact with specific prokaryotic RNA-binding proteins, for example, tmRNA with SmpB
(Karzai, A.W. et al. 1999 EMBO J 18:3793-3799), and the possibility of other sRNAs
interacting with these proteins or other general sRNA-binding proteins should be tested.
This approach is adaptable to all organisms, and, in fact, binding to Sm and Fibrillarin
proteins has been the basis for identification of several sRNAs in eukaryotic cells
(Montzka, K.A. and Steitz, J.A. 1988 PNAS USA 85:8885-8889; Tyc, K. and Steitz, J.A.
1989 EMBO J 8:3113-3119).

All the criteria we used to identify sRNAs also will detect short genes encoding new
small peptides, and we have found six conserved short ORFs. Although our approach was
intended to develop methods to identify non-translated genes within the genome, short
ORFs also are missing from annotated genome sequences. The combination of a
requirement for conservation and/or transcription with sequence predictions for ORFs
should add significantly to our ability to recognize short ORFs. Small polypeptides have
been shown to have a variety of interesting cellular roles. We expect that the short ORFs
we have found are involved in signaling pathways, akin to those of B. subtilis peptides that
enter the medium and carry out cell-cell signaling (reviewed in Lazazzera, B.A. 2000 Curr
Opin Microbiol 3:177-182).
Characteristics and Functions of New sRNAs 

The current work serves as a blueprint for the prediction, detection, and
characterization of a large group of novel sRNAs. We have definitive information on
characteristics that provide information regarding the cellular roles of these new sRNAs.
Several known sRNAs that bind the Hfq protein act via base pairing to target mRNAs. The
finding that a number of our new sRNAs bind Hfq indicates a similar mechanism of action
for this subset of sRNAs. We searched the E. coli genome for possible complementary
target sequences and examined phenotypes associated with multicopy plasmids containing
new sRNA genes. Intriguingly, #25, an sRNA preferentially expressed in minimal
medium, has extended complementarity to a sequence near the start of sdhD, the second
gene of the succinate dehydrogenase operon. When the #25 Ig region is present on a
multicopy plasmid, it interferes with growth on succinate minimal medium (Table 2),
consistent with #25 sRNA acting as an antisense RNA for sdhD. Complementarity to many
target mRNAs was found for a number of other novel sRNAs, confirming the validity of
this analysis.

As outlined in the evaluation of each of our approaches, we do not expect our
searches have been exhaustive. sRNAs also have been detected by others using a variety of
approaches. The sRNA encoded by #38 was independently identified as a regulatory RNA
(CsrC RNA; T. Romeo, pers. comm.), and others have found additional sRNAs using
variations of the approaches used here (Argaman, L. et al. 2001 Curr Biol, in press).
Nevertheless, we think it unlikely that there are many more than 50 sRNAs encoded by the
E. coli chromosome and by closely related bacteria. We expect such sRNAs to be present
and playing important regulatory roles in all organisms. Using the approaches described
here, it is feasible to search all sequenced organisms for these important regulatory
molecules. We anticipate that study of the expanded list of sRNAs in E. coli will allow a
more complete understanding of the range of roles played by regulatory sRNAs.

EXAMPLE 1 

Computer Searches

Ig regions are defined here as sequences between two neighboring ORFs. We
compared Ig regions of >180 nt against the NCBI Unfinished Microbial Genomes database
(http://www.ncbi.nlm.nih.gov/Microb_blast/unfinishedgenom) using the BLAST
program (Altschul, S.F. et al. 1990 J Mol Biol 215:403-410). Salmonella enteritidis
sequence data were from the University of Illinois, Department of Microbiology
(http://www.salmonella.org). Salmonella typhi and Yersinia pestis sequence data were
from the Sanger Centre (http://www.sanger.ac.uk/Projects/S_typhi/ and
http://www.sanger.ac.uk/Projects/Y_pestis/). Salmonella typhimurium, Salmonella
paratyphi, and Klebsiella pneumoniae sequences were from the Washington University
Genome Sequencing Center.
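
As an illustration of the Ig-region definition above (sequence between two neighboring ORFs, retained only when longer than 180 nt), the following Python sketch extracts such gaps from ORF coordinates. It is a simplified sketch, not the procedure actually used: the function name and the 1-based inclusive coordinate convention are assumptions, and it ignores strand, the gap spanning the origin of a circular chromosome, and non-ORF features such as tRNA genes.

```python
from typing import List, Tuple

def intergenic_regions(orfs: List[Tuple[int, int]],
                       min_len: int = 181) -> List[Tuple[int, int]]:
    """Return (start, end) of gaps between neighboring ORFs longer than 180 nt.

    orfs: 1-based inclusive (start, end) coordinates with start <= end,
    given in any order.
    """
    regions = []
    ordered = sorted(orfs)
    if not ordered:
        return regions
    prev_end = ordered[0][1]
    for start, end in ordered[1:]:
        gap_start, gap_end = prev_end + 1, start - 1
        # Gap length is gap_end - gap_start + 1; overlapping ORFs give <= 0.
        if gap_end - gap_start + 1 >= min_len:
            regions.append((gap_start, gap_end))
        # Track the furthest ORF end so nested ORFs do not create false gaps.
        prev_end = max(prev_end, end)
    return regions
```

For example, ORFs at (1, 100), (300, 500), and (550, 900) leave one gap of 199 nt (positions 101 to 299) that passes the length filter and one gap of 49 nt that does not.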

Each Ig region was rated based on the best match to Salmonella or K. pneumoniae
species. Ig regions containing previously identified sRNAs were rated 5 (each of them met
the criteria to be rated 4). Ig regions were rated 4 if the raw BLAST score was >200
(hatched bars in Fig. 1) or 80-200 (double-diagonal bars in Fig. 1) extending for >80 nt; 3 if
the raw BLAST score was 80-200 (double-diagonal bar) extending for 60-80 nt; 2 if the
raw BLAST score was 50-80 (diagonal bar) extending for >65 nt; and 1 if the raw BLAST
score was <50 (diagonal-dash, solid or none) or <65 nt. The location of the longest
conserved section(s) within each Ig and the number of matches to the NCBI Unfinished
Microbial database were recorded. Note that the computer searches were done from May
2000 to December 2000; more sequences are expected to match as the database continues
to expand. The identity and orientation of genes flanking each Ig region were determined
from the Colibri database (http://genolist.pasteur.fr/Colibri). Ig regions that the Colibri
database predicted to be <180 nt in length and Ig regions containing tRNA and/or rRNAs
were rated 0 and removed from further consideration.
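
The rating scheme above can be summarized as a decision function. The Python sketch below is one possible reading of it, not the authors' program: the function and parameter names are hypothetical, a raw score >200 is assumed to rate 4 regardless of match length, boundary values (for example a score of exactly 80) are assigned by assumption, and the rating of 0 is assumed to take precedence over the rating of 5.

```python
def rate_ig_region(raw_score: float, match_len: int, ig_len: int,
                   has_known_srna: bool = False,
                   has_trna_or_rrna: bool = False) -> int:
    """Rate an Ig region (0-5) from its best BLAST match.

    raw_score: raw BLAST score of the best match to Salmonella or K. pneumoniae
    match_len: length of that match in nt
    ig_len:    length of the Ig region in nt
    """
    # Short regions and regions with tRNA/rRNA genes were removed (rating 0).
    if ig_len < 180 or has_trna_or_rrna:
        return 0
    # Regions containing previously identified sRNAs were rated 5.
    if has_known_srna:
        return 5
    if raw_score > 200:
        return 4  # assumption: no length requirement at this score
    if 80 <= raw_score <= 200:
        if match_len > 80:
            return 4
        if 60 <= match_len <= 80:
            return 3
        return 1
    if 50 <= raw_score < 80 and match_len > 65:
        return 2
    return 1
```

Under these assumptions, a 300-nt Ig region whose best match has a raw score of 150 over 70 nt would be rated 3.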
Strains and Plasmids 

Strains were grown at 37°C in Luria-Bertani (LB) medium or M63 minimal
medium supplemented with 0.2% glucose and 0.002% vitamin B1 (Silhavy, T.J. et al. 1984
Experiments with Gene Fusions, Cold Spring Harbor Laboratory, Cold Spring Harbor, N.Y.)
except for phenotype testing of strains carrying multicopy plasmids as described below.
Ampicillin (50 µg/ml) was added where appropriate. E. coli MG1655 was the parent for all
strains used in this study. MG1655 Δlac (DJ480, obtained from D. Jin, NCI) was
lysogenized with a λ phage carrying an rpoS-lacZ translational fusion (Sledjeski, D.D. et
al. 1996 EMBO J 15:3993-4000) to create strain SG30013.

To generate clones containing the Ig region of each candidate (pCR-#N where N
refers to candidate number; see Table 1), Ig regions were amplified by PCR from a
MG1655 colony and cloned into the pCRII vector using the TOPO TA cloning kit
(Invitrogen). Oligonucleotides were designed so the entire conserved region and in most
cases the full Ig region was included. In a few cases, repeated sequences or other
irregularities required a reduction in the Ig regions cloned. See Table 3 for a list of all
oligonucleotides used in this paper. Ig regions encoding sRNAs also were cloned into
multicopy expression vectors (pRS-#N) in which each Ig region is flanked by several
vector-encoded transcription terminators. To generate pRS-#N plasmids, pCR-#N plasmids
were digested with BamHI and XhoI and the Ig-containing fragments were cloned into the
BamHI and SalI sites of pRS1553 (Pepe, C.M. et al. 1997 J Mol Biol 270:14-25), replacing
the lacZ-α peptide. To construct pBS-spot42, the Spot42-containing fragment was
amplified by PCR from K12 genomic DNA, digested with EcoRI and BamHI and cloned
into corresponding sites in pBluescript II SK+ (Stratagene). All DNA manipulations were
carried out using standard procedures. All clones were confirmed by sequencing.
Table 3. Oligonucleotides

Oligo Name | Sequence | SEQ ID NO | Candidate Number
KW-39 | GCGCiJrCGTTATCATCCAAAATACG | 32 | #1
KW-40 | Cj 1 CCjCCC ACjCCAA 1 GCl 1 ICACr 1 CG | 33 |
KW-41 | ATTGATCGCACACCTGACAGCTGCC | 34 | #2
KW-42 | G n Gl CACCC 1 GGACCTGGTCGTAC | 35 |
KW-43 | TGACCGCGAl 1 IGCACAAAATGC | 36 | #3
KW-44 | ACTCTTAAAl 1 iCCTATCAAAACrCGC | 37 |
KW-45 | GGTAl 1 1 ICAGAGATTATGAATTGCCG | 38 | #4
KW-46 | TCACCTCTCCTTCGAGCGCTACTGG | 39 |
KW-47 | AATGCTCTCCTGATAATGTTAAACTT | 40 | #5
KW-48 | GGTTAGCTCCGAAGCAAAAGCCGGAT | 41 |
KW-49 | TAATTC(J1 i ICAAATGAAACGGAGC | 42 | #6
KW-50 | GGACTCCCTCATTATAATTACTGG | 43 |
KW-51 | CTCCTTAAACAAGGACATTAGTCTACG | 44 | #7
KW-52 | ATTCACCITACCTAATTTGATTCTTCC | 45 |
KW-123 | CCATCGCTTGACGTTGCATTCACCTGC | 46 | #8 (probes)
KW-124 | GTCGGCGTCGTACGAATCAATTGTGC | 47 |
KW-125 | GCACAATTGATTCGTACGACGCCGAC | 48 |
KW-55 | TAAGGATAATATTGCAGATCGTAAG | 49 | #9
KW-56 | ATCATCAAACAGCAACTTGCCC | 50 |
KW-57 | TGTCCTTCTCCTGCAAGAGAATTATT | 51 | #10
KW-58 | GCTAATAATAATGTCTmrCGCTCC | 52 |
FR-100 | GCTTTTGTGAATTAATTTGTATATCGAAGCG | 53 | #11
FR-101 | TATTAATACCCTCTAGATTGAGTTAATC | 54 |
FR-102 | CGATI'IACCTCACTTCATCGCTIICAG | 55 | #12
FR-103 | TGATCCTGACTTAATGCCGCAAGTTC | 56 |
FR-104 | GCITATCTCCGGCACrCTCAGTGGCTTAGCTCTTGAAGG | 57 | (probe)
FR-105 | TTGCTCACATCTCAul l i AATCGTGCTC | 58 | #13
FR-106 | ATATTCCACCAGCTAl 1 IGTTAGTGAATAAAAGG | 59 |
FR-107 | I GAl l AAl 1 l UGA l l A l 1 1 1 I CGUGGAIGG | 60 | #14
FR-108 | All AGAAAC AGGAAGCCCG 1 GAG i CGAG | 61 |
FR-109 | TTAl 1 1 iCCCCGGAAGCACATTCACrTCAC | 62 | #15
[name illegible in source] | [sequence illegible in source] | 63 |
FR-111 | TGCTTACTCATCAAAAGTAGCGCCAGATTC | 64 | #16
FR-112 | TAATCGACGGACGATAGATAATTCCTG | 65 |
FR-113 | CCAATGTGTCGCCTTTTTCAACTTTCCG | 66 | #17
FR-114 | CGATTTATGAGAATAAATACTCATTTAAGGGTG | 67 |
FR-115 | AAATCCGACTTTAGTTACAACATAC | 68 | #18
FR-116 | GACCAGACCTTCTTGATGATGGGCAC | 69 |
KW-69 | CGACCTCAATTCCACGGGATCTGG | 70 | #19
KW-70 | ATTTAGCTGTAGTAATCACTCGCCG | 71 |
KW-71 | GGTCTCCTTAGCGCCTTATTGCG | 72 | #20
KW-72 | CGCCCACATGCTGTTCITATTATTCCC | 73 |
KW-73 | TTTATGACACCTGCCACTGCCGTC | 74 | #21
KW-74 | CTGTCAAGTTATCTGTTTGTTAAGTCAAGC | 75 |
KW-126 | GCrGTGAAGCACCTGCGTTGCTCATG | 76 | #22
KW-127 | GCTGTGAAACACCTGCAl 1 1 ACGGCCACGG | 77 | (probes)
KW-128 | CCGTGGCCGTAAATGC AGO 1 GTTl CAC AGC | 78 |
KW-77 | CCl 1 ICGCAATTGACTGAAACAC | 79 | #24
KW-78 | GGCTAGACCGGGGTGCGCG | 80 |
KW-79 | AAGGTGGTTATTTACACCTTAGCG | 81 | #25
KW-80 | GTCCTCl 1 rOGGGTAAATOTC | 82 |
KW-81 | AATGCTCCGGl 1 iCATGTCATC | 83 | #26
KW-82 | TAGTTCCTTCTCACCCGGAG | 84 |
FR-117 | CACAAGGGCGCl 1 IACjI 1 lOl i i iCCG | 85 | #27
FR-118 | ATCCCCTGAGAGTTTAAl 1 1 iCGTCAAG | 86 |
KW-85 | TAATTCGTCGTAATTCGTCCTCC | 87 | #28
KW-86 | CTCTGCCTTCCTGri'i 1 iGTTGTG | 88 |
FR-119 | AAACGCATTTGCAACTGTCGGCGCTTTTCC | 89 | #29
FR-120 | CTTGTTACCTCAAAAAATCACAGTGCTCG | 90 |
FR-121 | GCAGTCGGTGATGCTGGATTTGCCCTG | 91 | #30
FR-122 | Gil 1 1 1 1 1 ACGGGTAAGCCGCAACGACCATTG | 92 |
FR-123 | 1 AGl AGAI AAGI I'l l ACiAlAAC | 93 | #31
FR-124 | TAAAACTGAAGTTGCCCTGAAAATG | 94 |
FR-125 | TGATGAGTGGTTCTGCAAGAGG | 95 | #32
FR-126 | TAAAAGACAGATTACCTGGCCTG | 96 |
FR-127 | CGGACTACCTCAAAATAAAGCTTTATATACG | 97 | #33
FR-128 | GTCATGATACCTTGATTAAAAAACAAACAGC | 98 |
FR-129 | GGCn^ATAATGCGCACATAACCTCTTG | 99 | #34
FR-130 | AATCTTTTCrrrAll'l 1 1 iGGCTAACGAATAGCC | 100 |
FR-131 | GTCCAACl 1 1 1 IGGGGTCAGTACAAACTTTG | 101 | #35
FR-132 | TAATAACGCCGTTATTAAATAGCCTGCC | 102 |
FR-133 | TAAGCAACGTCTGCTTACTGCCCCTC | 103 | #36
FR-134 | GTGATGGCl ICl GATAAAGATAAAl 1 l ATAGCC | 104 |
FR-135 | TAACAGGCTAAGAGGGGC | 105 | #37
FR-136 | ATTGCCACTCnTCrTGATCAAATAACCG | 106 |
FR-137 | AATGCGTCTGTTGATAATTCAAATTAGTC | 107 | #38
FR-138 | TAGCCGl 1 1 1 ATTCAGTATAGATTTGCG | 108 |
KW-89 | GTTCGTCGGTAACCCGTTTCAGC | 109 | #39
KW-90 | ATGGCTTAAAGAGAGGTGCC | 110 |
KW-91 | CCjTACI 1 1 AAACjQGAGAAl GAC | 111 | #40
KW-92 | GTGCTTCCTCATTATGGTGACG | 112 |
KW-93 | GAATGGAGGGAGATTACACG | 113 | #41
KW-94 | CCTTAGTGGGTAAACGCTTAC | 114 |
KW-95 | CTTTCAGGCAGCTAAGGAAAG | 115 | #42
KW-96 | CAATATGTATTATTGATTGAGTAAACGGG | 116 |
KW-97 | CCTCTTCCAGGAATAATCXZIC | 117 | #43
KW-98 | CGGAAAGCGGTTCACAGATC | 118 |
KW-132 | CTCGTAAGTTTCGCAGCTTATTA | 119 | #43 (probe)
KW-99 | TGAAATTCCTGTCCGACAGG | 120 | #44
KW-100 | GCACTACCGCAATGTTATTGC | 121 |
KW-101 | GCTTACCCAATAAATAGTTACACG | 122 | #45
KW-102 | TAAAACCTGTCACAAATCACAAA | 123 |
KW-103 | GTGGCCTGCTTCAAACTTTCG | 124 | #46
KW-104 | GTAAAGTCTAGCCTGGCGGTTCG | 125 |
FR-139 | TAATrcrcGTACGCCTGGCAGATATTTTGCC | 126 | #47
FR-140 | ATCAACCTCAAAAGGGAAATCGGG | 127 |
KW-105 | TAACTTGTTGTAAGCCGGATCGG | 128 | #48
KW-106 | TGAAGCATCTATCGCCGGTTGCG | 129 |
KW-107 | GATTAGAAATCCmTGAAAGCGCATTG | 130 | #49
KW-108 | CTTATTGGGCACCGCAATGG | 131 |
KW-109 | CGAACACAATAAAGATTTAATTCAGCC | 132 | #50
KW-110 | CTGATGCTACTGTGTCAACG | 133 |
KW-111 | AATAATCAGACATAGCTTAGGC | 134 | #51
KW-112 | GCCGrGA'lGG'rniCGCG'lTC | 135 |
KW-113 | TA'lTri'CCTCCCGCGCTAAAG | 136 | #52
KW-114 | TTCAGCTGATGACCACCACGCTT | 137 |
KW-115 | GAGTTGTCAGAGCAGGATGATTC | 138 | #53
KW-116 | lAlCrGCGCriAlCCl'l'lAlGG | 139 |
KW-117 | CCTTTACGGTGATAACCX3TCGCG | 140 | #54
KW-118 | CTGACAAGCCTCTCATTCTCTTGTC | 141 |
KW-119 | GAGAATTATCGAGGTCCGGTATC | 142 | #55
KW-120 | CTACGCGTTAGCGATAGACTGC | 143 |
FR-141 | AGGCTTACTAAGAACACCAGGGGGAGGGGAA | 144 | probe for 55-I
FR-142 | AGTCATAAGCTTCCCCGCTTACTAAGACTA | 145 | probe for 55-II
KW-121 | CCTCAAATCGGCCATAATAACC | 146 | #56
KW-122 | TAAACACCXJTCGTCAGAAATGC | 147 |
FR-143 | TAGACTTTTATCCACl'l"iA'l'lGCl'G | 148 | #57
FR-144 | GTGTGCCTTTCGGCGATATGGCGTG | 149 |
FR-145 | CCTTTACGTGGGCGGlGAll'lTGTC | 150 | #58
FR-146 | TAGCTTTGCKXn-GGATGTTTGCC | 151 |
FR-147 | GCTGTAATTTATTCAGCGTTTGTACATACG | 152 | #59 (probe)
FR-148 | TCAGTCAACTCGCTGCGGCGTGTTAC | 153 | #60
FR-149 | CTTATTGTTGCTTAGTTAGGGTAGTCAC | 154 |
KW-131 | CAGTCAGTCTCAGGGGAGGAGCAATC | 155 | #61 (probe)
KW-59 | TGAATGCACAATAAAAAAATCCCGACCCTG | 156 | for DsrA Ig region
KW-60 | AGTCGCGCAGTACTCCTC l l ACCAG | 157 |
[name missing in source] | [sequence missing in source] | 158 | for RprA Ig region
KW-64 | TAACATTATCAGCCTGCTGACGGC | 159 |
sp42-5'-1 | GGCCGAATTCGTAGGGTACAGAGGTAAG | 160 | for cloning pBS-spot42
sp42-3'-1 | GGCCGGATCCGTCATTACTGACTGGGGCGG | 161 |

RNA Analysis 

RNA for Northern analysis was isolated directly from ~3x10^ cells in exponential
growth (OD600 = 0.2-0.4) or stationary phase (overnight growth) as described previously
(Wassarman, K.M. and Storz, G. 2000 Cell 101:613-623). Five-ng RNA samples were
fractionated on 10% polyacrylamide urea gels and transferred to Hybond N membrane as
described previously (Wassarman, K.M. and Storz, G. 2000 Cell 101:613-623). For
Northern analysis of candidate regions, double-stranded DNA probes were generated by
PCR from a colony of MG1655 cells or from the pCR-#N plasmids with oligonucleotides
used for cloning the pCR-#N plasmids. PCR amplification was done with 52°C annealing
for 30 cycles in 1x PCR buffer (1 mM each dATP, dGTP, and dTTP; 2.5 µM dCTP; 100
µCi [α-32P]dCTP; 10 ng plasmid; 1 unit Taq polymerase) (Perkin Elmer). Probes were
purified over G-50 microspin columns (Amersham Pharmacia Biotech) prior to use.
Northern membranes were prehybridized in a 1:1 mixture of Hybrisol I and Hybrisol II
(Intergen) at 40°C. DNA probes with 500 µg sonicated salmon sperm DNA were heated
for 5 min to 95°C, added to prehybridization solution, and membranes were hybridized
overnight at 40°C. Membranes were washed by rinsing twice with 4x SSC/0.1% SDS at
room temperature followed by three washes with 2x SSC/0.1% SDS at 40°C. Northern blot
analysis using RNA probes was done as described previously (Wassarman, K.M. and
Steitz, J.A. 1992 Mol Cell Biol 12:1276-1285). RNA probes were generated by in vitro
transcription according to manufacturer protocols (Roche Molecular Biochemicals) from
pCR-#N plasmids linearized with EcoRV or HindIII using SP6 RNA polymerase or T7
RNA polymerase, respectively; pBS-6S (pGS0112; Wassarman, K.M. and Storz, G. 2000
Cell 101:613-623) or pBS-spot42 were linearized with EcoRI using T3 RNA polymerase;
pGEM-5S (pG5019; Altuvia, S. et al. 1997 Cell 90:43-53) or pGEM-10Sa (Altuvia, S. et
al. 1997 Cell 90:43-53) were linearized with EcoRI using SP6 RNA polymerase.
Oligonucleotide probes were labeled by polynucleotide kinase according to manufacturer
protocols (New England Biolabs) using [γ-32P]ATP (>5000 Ci/mmole; Amersham
Pharmacia Biotech). For oligonucleotide probes, Northern membranes were prehybridized
in Ultrahyb (Ambion) at 40°C followed by addition of labeled oligonucleotide probe and
hybridization overnight at 40°C. Membranes were washed twice with 2x SSC/0.1% SDS at
room temperature followed by two washes with 0.1x SSC/0.1% SDS at 40°C for 15
minutes each.
Immunoprecipitation 

Immunoprecipitations were carried out using extracts from cells in exponential
growth (OD600 = 0.2-0.4) or stationary phase (overnight growth) as described previously
(Wassarman, K.M. and Storz, G. 2000 Cell 101:613-623), using rabbit antisera against the
Hfq protein or preimmune serum. After immunoprecipitation, RNA was isolated from
Protein A Sepharose-antibody pellets by extraction with phenol:chloroform:isoamyl
alcohol (50:50:1) followed by ethanol precipitation. RNA was examined on gels directly
after 3' end labeling or analyzed by Northern hybridization after fractionation on 10%
polyacrylamide urea gels as described previously (Wassarman, K.M. and Storz, G. 2000
Cell 101:613-623).
rpoS-lacZ Expression

Effects on rpoS-lacZ expression by multicopy plasmids containing the novel sRNAs
were determined from a single colony of SG30013 transformed with pRS-#N, grown for 18
h in 5 ml of LB-ampicillin medium or M63-ampicillin medium supplemented with 0.2%
glucose at 37°C. β-galactosidase activity in the culture was assayed as described
previously (Zhou, Y.-N. and Gottesman, S. 1998 J Bacteriol 180:1154-1158). The
numbers provided in Table 2 were calculated as the ratio between pRS-#N and the
pRS1553 vector control.
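
The ratio reported in Table 2 is a simple normalization of each plasmid's activity to the vector control. A minimal Python sketch follows; the function name is hypothetical and any activity values supplied to it are the caller's own measurements, not data from this work.

```python
def relative_expression(plasmid_activity: float, vector_activity: float) -> float:
    """Return beta-galactosidase activity of a pRS-#N strain relative to the
    pRS1553 vector control (values >1 indicate increased rpoS-lacZ
    expression, values <1 decreased expression)."""
    if vector_activity <= 0:
        raise ValueError("vector control activity must be positive")
    return plasmid_activity / vector_activity
```

Both activities must be measured in the same units and under the same growth condition for the ratio to be meaningful.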
Phenotype Testing

To test carbon source utilization or temperature sensitivity associated with the
multicopy plasmids containing the novel sRNAs, a single colony of MG1655 transformed
with a given pRS-#N was grown for 6 hours in 5 ml LB-ampicillin medium at 37°C. Then
10 µl of serial dilutions (10"^, 10"^, and 10"*) were spotted on M63-ampicillin plates
containing 0.2% of the carbon source being tested (glucose, arabinose, lactose, glycerol,
ribose, or succinate) and grown at 37°C; or on LB plates incubated at room temperature or
42°C. Plates were analyzed after both 1 and 2 days. Failure to grow in Table 2 indicates an
efficiency of plating of <10"^.
Microarray Analysis 

RNA for microarray analysis was isolated using the MasterPure RNA purification
kit according to the manufacturer protocols (Epicentre) from MG1655 cells grown to OD600
= 0.8 in LB medium at 37°C. DNA was removed from RNA samples by digestion with
DNase I for 30 min at 37°C. Probes for microarray analysis were generated by one of two
methods: direct labeling of enriched mRNA or generation of labeled cDNA.

To generate direct labeled RNA probes, mRNA enrichment and labeling was done
as described in the Affymetrix expression handbook (Affymetrix). Oligonucleotide primers
complementary to 16S and 23S rRNA were annealed to total RNA followed by reverse
transcription to synthesize cDNA strands complementary to 16S and 23S rRNA species.
16S and 23S rRNAs were degraded with RNase H followed by DNase I treatment to remove cDNA
and oligonucleotides. Enriched RNA was fragmented for 30 min at 95°C in 1x T4
polynucleotide kinase buffer (New England Biolabs), followed by labeling with γ-S-ATP
and T4 polynucleotide kinase and ethanol precipitation. The biotin label was introduced by
resuspending RNA in 96 µl of 30 mM MOPS (pH 7.5), 4 µl of a 50 mM iodoacetylbiotin
solution, and incubating at 37°C for 1 hr. RNA was purified using the RNA/DNA Mini Kit
according to manufacturer protocols (QIAGEN).

To generate cDNA probes, 5 µg of total RNA was reverse transcribed using the
SuperScript II system for first strand cDNA synthesis (Life Technologies) and 500 ng of
random hexamers. RNA and primers were heated to 70°C and cooled to 25°C; reaction
buffer was then added, followed by addition of SuperScript II and incubation at 42°C. RNA
was removed by RNase H and RNase A. The cDNA was purified using the Qiaquick
cDNA purification kit (QIAGEN) and fragmented by incubation of up to 5 µg cDNA and
0.2 U DNase I for 10 min at 37°C in 1x One-Phor-All buffer (Amersham-Pharmacia
Biotech). The reaction was stopped by incubation for 10 min at 99°C, and fragmentation
was confirmed on a 0.7% agarose gel to verify that the average fragment length was 50-100
nt. Fragmented cDNA was 3'-end-labeled with terminal transferase (Roche Molecular
Biochemicals) and biotin-N6-ddATP (DuPont/NEN) in 1x TdT buffer (Roche Molecular
Biochemicals) containing 2.5 mM cobalt chloride for 2 hours at 37°C.

Hybridization to microarrays and staining procedures were done according to the
Affymetrix expression manual (Affymetrix). The arrays were read at 570 nm with a
resolution of 3 µm using a laser scanner.

Gene expression was analyzed using the Affymetrix Microarray Suite 4.01
software program. Transcripts in intergenic regions were detected using the
intensities of each probe designed to be a perfect match and the corresponding probe
designed to be the mismatch. If the perfect match probe showed an intensity that was 200
units higher than the mismatch probe, the probe pair was called positive. Two neighboring
positive probe pairs were considered evidence of a transcript. The location and length of
the transcripts were estimated based on the first and last identified positive probe pair
within an Ig region.
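The probe-pair rule described above can be sketched in a few lines of Python. This is a minimal illustration and not the software actually used: the function names, the input layout (parallel lists of probe positions, perfect-match and mismatch intensities for one Ig region), and the choice to treat "200 units higher" as a difference of at least 200 are all assumptions. The sketch reports each run of neighboring positive probe pairs by the positions of its first and last positive pair.

```python
from typing import List, Sequence, Tuple


def positive_pairs(pm: Sequence[float], mm: Sequence[float],
                   threshold: float = 200.0) -> List[bool]:
    """Call a probe pair positive when PM exceeds MM by at least `threshold` units."""
    return [(p - m) >= threshold for p, m in zip(pm, mm)]


def find_transcripts(positions: Sequence[int], pm: Sequence[float],
                     mm: Sequence[float], threshold: float = 200.0,
                     min_run: int = 2) -> List[Tuple[int, int]]:
    """Return (first, last) probe positions for each run of >= min_run
    neighboring positive probe pairs within one intergenic region."""
    calls = positive_pairs(pm, mm, threshold)
    transcripts = []
    run_start = None
    # A trailing False sentinel closes any run that reaches the last probe.
    for i, is_positive in enumerate(calls + [False]):
        if is_positive and run_start is None:
            run_start = i
        elif not is_positive and run_start is not None:
            if i - run_start >= min_run:
                transcripts.append((positions[run_start], positions[i - 1]))
            run_start = None
    return transcripts


# Hypothetical intensities for six probe pairs in one Ig region.
positions = [100, 130, 160, 190, 220, 250]
pm = [500, 900, 950, 300, 800, 100]
mm = [400, 600, 700, 250, 500, 90]
print(find_transcripts(positions, pm, mm))
```

In this made-up example only the pairs at positions 130 and 160 form a run of two neighboring positives, so a single candidate transcript spanning 130 to 160 is reported; the isolated positive pair at 220 is ignored.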

While the present invention has been described in some detail for purposes of
clarity and understanding, one skilled in the art will appreciate that various changes in form
and detail can be made without departing from the true scope of the invention. All patents,
patent applications and publications referred to above are hereby incorporated by reference.
WHAT IS CLAIMED IS: 

1. An isolated polynucleotide comprising Candidate #8, or its complement, or 
its homolog having at least about 95% identity thereto. 

2. An isolated polynucleotide comprising Candidate #12, or its complement, or 
its homolog having at least about 95% identity thereto. 

3. An isolated polynucleotide comprising Candidate #14, or its complement, or 
its homolog having at least about 95% identity thereto. 

4. An isolated polynucleotide comprising Candidate #22, or its complement, or 
its homolog having at least about 95% identity thereto. 

5. An isolated polynucleotide comprising Candidate #24, or its complement, or 
its homolog having at least about 95% identity thereto. 

6. An isolated polynucleotide comprising Candidate #25, or its complement, or 
its homolog having at least about 95% identity thereto. 

7. An isolated polynucleotide comprising Candidate #26, or its complement, or 
its homolog having at least about 95% identity thereto. 

8. An isolated polynucleotide comprising Candidate #27, or its complement, or 
its homolog having at least about 95% identity thereto. 

9. An isolated polynucleotide comprising Candidate #31, or its complement, or 
its homolog having at least about 95% identity thereto. 

10. An isolated polynucleotide comprising Candidate #38, or its complement, or 
its homolog having at least about 95% identity thereto. 

11. An isolated polynucleotide comprising Candidate #40, or its complement, or 
its homolog having at least about 95% identity thereto. 

12. An isolated polynucleotide comprising Candidate #41-I, or its complement, 
or its homolog having at least about 95% identity thereto. 

13. An isolated polynucleotide comprising Candidate #41-II, or its complement, 
or its homolog having at least about 95% identity thereto. 

14. An isolated polynucleotide comprising Candidate #43, or its complement, or 
its homolog having at least about 95% identity thereto. 

15. An isolated polynucleotide comprising Candidate #52-I, or its complement, 
or its homolog having at least about 95% identity thereto. 

16. An isolated polynucleotide comprising Candidate #52-II, or its complement, 
or its homolog having at least about 95% identity thereto. 

17. An isolated polynucleotide comprising Candidate #55-I, or its complement, 
or its homolog having at least about 95% identity thereto. 

18. An isolated polynucleotide comprising Candidate #55-II, or its complement, 
or its homolog having at least about 95% identity thereto. 

19. An isolated polynucleotide comprising Candidate #61, or its complement, or 
its homolog having at least about 95% identity thereto. 

20. An isolated polynucleotide comprising Candidate #9, or its complement, or 
its homolog having at least about 95% identity thereto. 

21. An isolated polynucleotide comprising Candidate #17, or its complement, or 
its homolog having at least about 95% identity thereto. 

22. An isolated polynucleotide comprising Candidate #28, or its complement, or 
its homolog having at least about 95% identity thereto. 

23. An isolated polynucleotide comprising Candidate #36, or its complement, or 
its homolog having at least about 95% identity thereto. 

24. An isolated polynucleotide comprising Candidate #49, or its complement, or 
its homolog having at least about 95% identity thereto. 

25. An isolated polynucleotide comprising Candidate #50, or its complement, or 
its homolog having at least about 95% identity thereto. 

26. An isolated polypeptide comprising a polypeptide encoded by Candidate #9, 
or its homolog having at least about 95% identity thereto. 

27. An isolated polypeptide comprising a polypeptide encoded by Candidate 
#17, or its homolog having at least about 95% identity thereto. 

28. An isolated polypeptide comprising a polypeptide encoded by Candidate 
#28, or its homolog having at least about 95% identity thereto. 

29. An isolated polypeptide comprising a polypeptide encoded by Candidate 
#36, or its homolog having at least about 95% identity thereto. 

30. An isolated polypeptide comprising a polypeptide encoded by Candidate 
#49, or its homolog having at least about 95% identity thereto. 
31. An isolated polypeptide comprising a polypeptide encoded by Candidate 
#50, or its homolog having at least about 95% identity thereto. 

32. An isolated antibody that selectively binds Candidate #8. 

33. An isolated antibody that selectively binds Candidate #12. 

34. An isolated antibody that selectively binds Candidate #14. 

35. An isolated antibody that selectively binds Candidate #22. 

36. An isolated antibody that selectively binds Candidate #24. 

37. An isolated antibody that selectively binds Candidate #25. 

38. An isolated antibody that selectively binds Candidate #26. 

39. An isolated antibody that selectively binds Candidate #27. 

40. An isolated antibody that selectively binds Candidate #31. 

41. An isolated antibody that selectively binds Candidate #38. 

42. An isolated antibody that selectively binds Candidate #40. 

43. An isolated antibody that selectively binds Candidate #41-I. 

44. An isolated antibody that selectively binds Candidate #41-II. 

45. An isolated antibody that selectively binds Candidate #43. 

46. An isolated antibody that selectively binds Candidate #52-I. 

47. An isolated antibody that selectively binds Candidate #52-II. 

48. An isolated antibody that selectively binds Candidate #55-I. 

49. An isolated antibody that selectively binds Candidate #55-II. 

50. An isolated antibody that selectively binds Candidate #61. 

51. An isolated antibody that selectively binds Candidate #9, or the polypeptide 
encoded by Candidate #9. 

52. An isolated antibody that selectively binds Candidate #17, or the 
polypeptide encoded by Candidate #17. 

53. An isolated antibody that selectively binds Candidate #28, or the 
polypeptide encoded by Candidate #28. 

54. An isolated antibody that selectively binds Candidate #36, or the 
polypeptide encoded by Candidate #36. 

55. An isolated antibody that selectively binds Candidate #49, or the 
polypeptide encoded by Candidate #49. 
56. An isolated antibody that selectively binds Candidate #50, or the 
polypeptide encoded by Candidate #50. 

57. Any of the polynucleotides of claims 1 to 19 or the polypeptides of claims 
26 to 31 for use as mediators of cell or intercell regulation. 
FIG. 2A 
FIG. 2D 
FIG. 4A FIG. 4B 
SEQUENCE LISTING 



<110> The Government of the United States of America, as represented by 
Secretary, Health and Human Services 
Gottesman, Susan 
Storz, Gisela 
Repoila, Francis 
Wassarman, Karen 
Rosenow, Carsten 

<120> IDENTIFICATION OF NEW SMALL RNAs AND 
ORFs 



<130> NIH210.001PCT 



<150> US 60/265402 
<151> 2001-02-01 

<160> 161 



<170> FastSEQ for Windows Version 4.0 



<210> 1 
<211> 93 
<212> DNA 
<213> E. Coli 



<400> 1 

gccccttcaa gagctaagcc actgagagtg ccggagataa gcgccggatg gggtagaaac 60 
ccttaagcct gtgtcgcaca gacttaaggg ttt 93 



<210> 2 
<211> 86 
<212> DNA 
<213> E. Coli 



<400> 2 

tcgctgaaaa acataaccca taaaatgcta gctgtaccag gaaccacctc cttagcctgt 60 
gtaatctccc ttacacgggc ttattt 86 

<210> 3 
<211> 307 
<212> DNA 
<213> E. Coli 



<400> 3 

actgcggccc tttccgccgt ctcgcaaacg ggcgctggct ttaggaaagg atgttccgtg 60 
ccgtaaatgc aggtgtttca cagcgcttgc tatcgcggca atatcgccag tggtgctgtc 120 
gtgatgcggt cttcgcatgg accgcacaat gaagatacgg tgcttttgta tcgtacttat 180 
tgtttctggt gcgctgttaa ccgaggtaaa taataaccgg agtctctccg gcgacaattt 240 
actggtggtt aacaaccttc agagcagcaa gtaagcccga atgccgccct ttgggcggca 300 
tatttta 307 

<210> 4 
<211> 65 
<212> DNA 
<213> E. Coli 

<400> 4 

acggcgcagc caagatttcc ctggtgttgg cgcagtattc gcgcaccccg gtctagccgg 60 
ggtca 65 

<210> 5 
<211> 92 
<212> DNA 
<213> E. Coli 

<400> 5 

cgcgatcagg aagaccctcg cggagaacct gaaagcacga cattgctcac attgcttcca 60 
gtattactta gccagccggg tgctggcttt tt 92 



<210> 6 
<211> 211 
<212> DNA 
<213> E. Coli 

<400> 6 

aacgagtaga tgctcattcc atctcttatg ttcgccttag tgcctcataa actccggaat 60 
gacgcagagc cgtttacggt gcttatcgtc cactgacaga tgtcgcttat gcctcatcag 120 
acaccatgga cacaacgttg agtgaagcac ccacttgttg tcatacagac ctgttttaac 180 
gcctgctccg taataagagc aggcgttttt t 211 



<210> 7 

<211> 141 

<212> DNA 

<213> E. Coli 



<400> 7 

catcaacacc aaccggaacc tccaccacgt gctcgaatga ggtgtgttga cgtcggggga 60 
aaccctcctg tgtaccagcg ggatagagag aaagacaaag accggaaaac aaactaaagc 120 
gcccttgtgg cgctttagtt t 141 

<210> 8 
<211> 79 
<212> DNA 
<213> E. Coli 

<400> 8 

tgccactgct tttctttgat gtccccattt tgtggagccc atcaaccccg ccatttcggt 60 
tcaaggttga tgggttttt 79 

<210> 9 
<211> 272 
<212> DNA 
<213> E. Coli 

<400> 9 

tgtttaaagc aaaggcgtaa agtagcaccc atagagcgag gacgctaaca ggaacaatga 60 
ctcaggatga gggtcaggag cgccaggagg cgaagacaga ggattgtcag gaagacaaac 120 
gtccggagac gtaattaaac ggaaatggaa tcaacacgga ttgttcccta aaggaaaaac 180 
agggtgtgtt ggcggcctgc aaggattgta agacccgtta agggttatga gtcaggaaaa 240 
aaggcgacag agtaatctgt cgcctttttt ct 272 

<210> 10 
<211> 195 
<212> DNA 
<213> E. Coli 

<400> 10 

acattgtaaa ccagagttgc gaaggtacaa aaaattaacg ttttagcaat agctatataa 60 
tatagcctgt gctatatctg tatgtaatgc aatcatccct caaggatcga cgggattagc 120 
aagtcaggag gtcttatgaa tgagttcaag aggtgtatgc gcgtgtttag tcattctccc 180 
tttaaagtac ggtta 195 

<210> 11 
<211> 82 
<212> DNA 
<213> E. Coli 

<400> 11 

atcccagagg tattgatagg tgaagtcaac ttcgggttga gcacatgaat tacaccagcc 60 
tgcgcagatg cgcaggtttt tt 82 

<210> 12 

<211> 92 

<212> DNA 

<213> E. Coli 

<400> 12 

atcccagagg tattgattgg tgagattatt cggtacgctc tcttcgtacc ctgtctcttg 60 
caccaacctg cgcggatgcg caggtttttt tt 92 

<210> 13 
<211> 278 
<212> DNA 
<213> E. Coli 

<400> 13 

actataaagt cagcgaagga aatgcttctg gcttttaaca gataaaaaga gaccgaacac 60 
gattcctgta ttcggtccag ggaaatggct cttgggagag agccgtgcgc taaaagttgg 120 
cattaatgca ggcttagttg ccttgccctt taagaataga tgacgacgcc aggttttcca 180 
gtttgcgtgc aaaatggtca ataaaaagcg tggtggtcat cagctgaaat gttaaaaacc 240 
gcccgttctg gtgaaagaac tgaggcggtt tttttatt 278 

<210> 14 
<211> 105 
<212> DNA 
<213> E. Coli 

<400> 14 

agggcaaggc aactaagcct gcattaatgc caacttttag cgcacggctc tctcccaaga 60 
gccatttccc tggaccgaat acaggaatcg tgttcggtct ctttt 105 



<210> 15 
<211> 144 
<212> DNA 
<213> E. Coli 

<400> 15 

agtgagggtt agggagaggt ttccccctcc ccctggtgtt cttagtaagc ctggaagcta 60 
atcactaaga gtatcaccag tatgatgacg tgcttcatca taaccctttc cttattaaaa 120 
gccctcttct ccgggagagg cttt 144 

<210> 16 
<211> 137 
<212> DNA 
<213> E. Coli 

<400> 16 

agtgagggta gagcggggtt tcccccgccc tggtagtctt agtaagcggg gaagcttatg 60 
actaagagca ccacgatgat gagtagcttc atcatgaccc tttccttatt tatggcccct 12 0 
tcctcgggag gggcttt 137 

<210> 17 

<211> 112 

<212> DNA 

<213> E. Coli 

<400> 17 

aggaacaagg gtaagggagg atttctcccc cctctgattg gctgttaata agctgcgaaa 60 
cttacgagta acaacacaat cagtatgatg acgagcttca tcataaccct tt 112 

<210> 18 
<211> 139 
<212> DNA 
<213> E. Coli 

<400> 18 

cagggcaata tctctcttgc aggtgaatgc aacgtcaagc gatgggcgtt gcgctccata 60 
ttgtcttact tccttttttg aattactgca tagcacaatt gattcgtacg acgccgactt 120 
tgatgagtcg gcttttttt 139 

<210> 19 

<211> 155 

<212> DNA 

<213> E. Coli 

<400> 19 

tagagtaaag gaacaagggt aagggaggat ttctcccccc tctgattggc tgttaataag 60 

ctgcgaaact tacgagtaac aacacaatca gtatgatgac gagcttcatc ataacccttt 120 

ccttctgtaa ggcccccttc ttcgggaggg gcttt 155 

<210> 20 
<211> 128 
<212> DNA 
<213> E. Coli 

<400> 20 

cataggggca atgataaaag gtggcaaaaa tgaatgtttc cagtagaact gtagtactga 60 
taaatttctt tgctgctgtt ggtttgttta ctcttatctc tatgagattt ggctggttta 120 
tttgatgt 128 

<210> 21 
<211> 31 
<212> PRT 
<213> E. Coli 

<400> 21 

Met Asn Val Ser Ser Arg Thr Val Val Leu Ile Asn Phe Phe Ala Ala 
1               5                   10                  15 
Val Gly Leu Phe Thr Leu Ile Ser Met Arg Phe Gly Trp Phe Ile 
            20                  25                  30 



<210> 22 
<211> 84 
<212> DNA 
<213> E. Coli 

<400> 22 

ataattataa gagaggttgt tatgattgaa cgtgaactgg ggaactggaa agactttatc 60 
gaagttatgc ttcgtaagta attc 84 

<210> 23 
<211> 19 
<212> PRT 
<213> E. Coli 

<400> 23 

Met Ile Glu Arg Glu Leu Gly Asn Trp Lys Asp Phe Ile Glu Val Met 
1               5                   10                  15 

Leu Arg Lys 



<210> 24 
<211> 180 
<212> DNA 
<213> E. Coli 



<400> 24 

aaaggagacg cttatgtttc gttggggcat catatttctg gttatcgcgt taatcgccgc 60 
cgcacttggg tttggtggtc tggccggtac cgctgcaggc gcagctaaaa ttgtctttgt 120 
cgtcgggatt attctgttcc tggtgagttt gttcatgggc cgaaaacgac cctagatttc 180 



<210> 25 
<211> 53 
<212> PRT 
<213> E. Coli 

<400> 25 

Met Phe Arg Trp Gly Ile Ile Phe Leu Val Ile Ala Leu Ile Ala Ala 
1               5                   10                  15 
Ala Leu Gly Phe Gly Gly Leu Ala Gly Thr Ala Ala Gly Ala Ala Lys 
            20                  25                  30 
Ile Val Phe Val Val Gly Ile Ile Leu Phe Leu Val Ser Leu Phe Met 
        35                  40                  45 
Gly Arg Lys Arg Pro 
    50 



<210> 26 
<211> 226 
<212> DNA 
<213> E. Coli 

<400> 26 

atacggagat atcatcatgg gcaaattagg tgaaaacgtt ccgcttctta tcgabaaagc 60 
cgtagatttc atggcatcaa gccaggcgtt ccgggagtat ctgaaaaaac ttcctccccg 120 
taacgcgatt ccgtccggaa tacccgatga aagcgtgccg ttatatctac aacgtctgga 180 
gtattatcgt cggctttatc ggccgaagca ggtagagggg cagtaa 226 

<210> 27 
<211> 69 
<212> PRT 
<213> E. Coli 

<400> 27 

Met Gly Lys Leu Gly Glu Asn Val Pro Leu Leu Ile Asp Lys Ala Val 
1               5                   10                  15 
Asp Phe Met Ala Ser Ser Gln Ala Phe Arg Glu Tyr Leu Lys Lys Leu 
            20                  25                  30 
Pro Pro Arg Asn Ala Ile Pro Ser Gly Ile Pro Asp Glu Ser Val Pro 
        35                  40                  45 
Leu Tyr Leu Gln Arg Leu Glu Tyr Tyr Arg Arg Leu Tyr Arg Pro Lys 
    50                  55                  60 
Gln Val Glu Gly Gln 
65 



<210> 28 
<211> 189 
<212> DNA 
<213> E. Coli 

<400> 28 

gagtagttaa catgaagcgg agtagaacgg aagtggggcg ctggcgcatg cagcgtcagg 60 
ctagccgacg taaatcgcgt tggcttgagg ggcaatcgcg ccgaaatatg cgtatccaca 120 
gcatcaggaa gtgcattcta aacaaacagc gtaactcgtt attgtttgcg atctacaata 180 
tctaaatgt 189 

<210> 29 
<211> 57 
<212> PRT 
<213> E. Coli 

<400> 29 

Met Lys Arg Ser Arg Thr Glu Val Gly Arg Trp Arg Met Gln Arg Gln 
1               5                   10                  15 
Ala Ser Arg Arg Lys Ser Arg Trp Leu Glu Gly Gln Ser Arg Arg Asn 
            20                  25                  30 
Met Arg Ile His Ser Ile Arg Lys Cys Ile Leu Asn Lys Gln Arg Asn 
        35                  40                  45 
Ser Leu Leu Phe Ala Ile Tyr Asn Ile 
    50                  55 



<210> 30 

<211> 117 

<212> DNA 

<213> E. Coli 

<400> 30 

aacggaggca aataatgctg ggtaatatga atgtttttat ggccgtactg ggaataattt 60 
tattttctgg ttttctggcc gcgtatttca gccacaaatg ggatgactaa tgaacgg 117 

<210> 31 

<211> 31 

<212> PRT 

<213> E. Coli 

<400> 31 

Met Leu Gly Asn Met Asn Val Phe Met Ala Val Leu Gly Ile Ile Leu 
1               5                   10                  15 
Phe Ser Gly Phe Leu Ala Ala Tyr Phe Ser His Lys Trp Asp Asp 
            20                  25                  30 



<210> 32 
<211> 25 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Oligonucleotide 
<400> 32 

gcgcctcgtt atcatccaaa atacg 25 

<210> 33 

<211> 25 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Oligonucleotide 
<400> 33 

gtcgcccagc caatgctttc agtcg 25 

<210> 34 
<211> 25 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Oligonucleotide 
<400> 34 

attgatcgca cacctgacag ctgcc 25 



<210> 35 
<211> 25 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Oligonucleotide 
<400> 35 

gttgtcaccc tggacctggt cgtac 25 

<210> 36 
<211> 23 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Oligonucleotide 
<400> 36 

tgaccgcgat ttgcacaaaa tgc 23 

<210> 37 
<211> 27 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Oligonucleotide 
<400> 37 

actcttaaat ttcctatcaa aactcgc 27 

<210> 38 
<211> 27 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Oligonucleotide 
<400> 38 

ggtattttca gagattatga attgccg 27 



<210> 39 
<211> 25 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Oligonucleotide 
<400> 39 

tcacctctcc ttcgagcgct actgg 25 



<210> 40 
<211> 26 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Oligonucleotide 
<400> 40 

aatgctctcc tgataatgtt aaactt 26 

<210> 41 
<211> 26 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Oligonucleotide 
<400> 41 

ggttagctcc gaagcaaaag ccggat 26 

<210> 42 
<211> 25 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Oligonucleotide 
<400> 42 

taattccttt caaatgaaac ggagc 25 

<210> 43 
<211> 24 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Oligonucleotide 
<400> 43 

ggactccctc attataatta ctgg 24 



<210> 44 
<211> 27 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Oligonucleotide 
<400> 44 

ctccttaaac aaggacatta gtctacg 27 



<210> 45 
<211> 27 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Oligonucleotide 
<400> 45 

attcacctta cctaatttga ttcttcc 27 

<210> 46 
<211> 27 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Oligonucleotide 
<400> 46 

ccatcgcttg acgttgcatt cacctgc 27 

<210> 47 
<211> 26 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Oligonucleotide 
<400> 47 

gtcggcgtcg tacgaatcaa ttgtgc 26 

<210> 48 
<211> 26 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Oligonucleotide 
<400> 48 

gcacaattga ttcgtacgac gccgac 26 



<210> 49 
<211> 25 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Oligonucleotide 
<400> 49 

taaggataat attgcagatc gtaag 25 

<210> 50 
<211> 22 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Oligonucleotide 
<400> 50 

atcatcaaac agcaacttgc cc 22 

<210> 51 
<211> 26 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Oligonucleotide 
<400> 51 

tgtccttctc ctgcaagaga attatt 26 

<210> 52 
<211> 26 

<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Oligonucleotide 
<400> 52 

gctaataata atgtcttttt cgctcc 26 

<210> 53 
<211> 31 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Oligonucleotide 

<400> 53 

gcttttgtga attaatttgt atatcgaagc g 31 

<210> 54 
<211> 28 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Oligonucleotide 
<400> 54 

tattaatacc ctctagattg agttaatc 28 



<210> 55 
<211> 27 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Oligonucleotide 
<400> 55 

cgatttacct cacttcatcg ctttcag 27 

<210> 56 
<211> 26 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Oligonucleotide 
<400> 56 

tgatcctgac ttaatgccgc aagttc 26 

<210> 57 
<211> 39 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Oligonucleotide 
<400> 57 

gcttatctcc ggcactctca gtggcttagc tcttgaagg 39 

<210> 58 
<211> 28 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Oligonucleotide 
<400> 58 

ttgctcacat ctcactttaa tcgtgctc 28 

<210> 59 
<211> 34 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Oligonucleotide 
<400> 59 

atattccacc agctatttgt tagtgaataa aagg 34 



<210> 60 
<211> 30 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Oligonucleotide 
<400> 60 

tgattaattt cgattatttt tcccggatgg 30 

<210> 61 
<211> 28 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Oligonucleotide 
<400> 61 

attagaaaca ggaagcccct cagtcgag 28 

<210> 62 
<211> 30 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Oligonucleotide 
<400> 62 

ttattttccc cggaagcaca ttcacttcac 30 

<210> 63 
<211> 24 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Oligonucleotide 
<400> 63 

tgatctattg cacaacgagg aagc 24 

<210> 64 
<211> 30 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Oligonucleotide 
<400> 64 

tgcttactca tcaaaagtag cgccagattc 30 

<210> 65 
<211> 27 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Oligonucleotide 
<400> 65 

taatcgacgg acgatagata attcctg 27 

<210> 66 
<211> 28 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Oligonucleotide 
<400> 66 

ccaatgtgtc gcctttttca actttccg 28 

<210> 67 
<211> 33 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Oligonucleotide 
<400> 67 

cgatttatga gaataaatac tcatttaagg gtg 33 

<210> 68 
<211> 25 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Oligonucleotide 
<400> 68 

aaatccgact ttagttacaa catac 25 



<210> 69 
<211> 26 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Oligonucleotide 
<400> 69 

gaccagacct tcttgatgat gggcac 26 

<210> 70 
<211> 24 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Oligonucleotide 
<400> 70 

cgacctcaat tccacgggat ctgg 24 

<210> 71 
<211> 25 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Oligonucleotide 
<400> 71 

atttagctgt agtaatcact cgccg 25 

<210> 72 
<211> 23 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Oligonucleotide 
<400> 72 

ggtctcctta gcgccttatt gcg 23 

<210> 73 
<211> 27 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Oligonucleotide 
<400> 73 

cgcccacatg ctgttcttat tattccc 27 

<210> 74 
<211> 24 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Oligonucleotide 
<400> 74 

tttatgacac ctgccactgc cgtc 24 



<210> 75 
<211> 30 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Oligonucleotide 
<400> 75 

ctgtcaagtt atctgtttgt taagtcaagc 30 

<210> 76 
<211> 26 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Oligonucleotide 
<400> 76 

gctgtgaagc acctgcgttg ctcatg 26 

<210> 77 
<211> 30 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Oligonucleotide 
<400> 77 

gctgtgaaac acctgcattt acggccacgg 30 

<210> 78 
<211> 30 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Oligonucleotide 
<400> 78 

ccgtggccgt aaatgcaggt gtttcacagc 30 



<210> 79 
<211> 23 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Oligonucleotide 
<400> 79 

cctttcgcaa ttgactgaaa cac 23 



<210> 80 
<211> 19 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Oligonucleotide 
<400> 80 

ggctagaccg gggtgcgcg 19 

<210> 81 
<211> 24 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Oligonucleotide 
<400> 81 

aaggtggtta tttacacctt agcg 24 

<210> 82 
<211> 21 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Oligonucleotide 
<400> 82 

gtcctctttg gggtaaatgt c 21 

<210> 83 
<211> 22 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Oligonucleotide 
<400> 83 

aatgctccgg tttcatgtca tc 22 



<210> 84 
<211> 20 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Oligonucleotide 
<400> 84 

tagttccttc tcacccggag 20 



<210> 85 
<211> 27 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Oligonucleotide 

<400> 85 

cacaagggcg ctttagtttg ttttccg 27 

<210> 86 
<211> 28 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Oligonucleotide 
<400> 86 

atcccctgag agtttaattt tcgtcaag 28 

<210> 87 
<211> 23 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Oligonucleotide 
<400> 87 

taattcgtcg taattcgtcc tcc 23 

<210> 88 
<211> 24 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Oligonucleotide 
<400> 88 

ctctgccttc ctgtttttgt tgtg 24 



<210> 89 
<211> 30 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Oligonucleotide 
<400> 89 

aaacgcattt gcaactgtcg gcgcttttcc 30 

<210> 90 
<211> 29 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Oligonucleotide 
<400> 90 

cttgttacct caaaaaatca cagtgctcg 29 

<210> 91 
<211> 27 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Oligonucleotide 
<400> 91 

gcagtcggtg atgctggatt tgccctg 27 

<210> 92 
<211> 32 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Oligonucleotide 
<400> 92 

gtttttttac gggtaagccg caacgaccat tg 32 

<210> 93 
<211> 22 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Oligonucleotide 
<400> 93 

tagtagataa gttttagata ac 22 



<210> 94 
<211> 25 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Oligonucleotide 
<400> 94 

taaaactgaa gttgccctga aaatg 25 



<210> 95 
<211> 22 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Oligonucleotide 
<400> 95 

tgatgagtgg ttctgcaaga gg 22 

<210> 96 
<211> 23 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Oligonucleotide 
<400> 96 

taaaagacag attacctggc ctg 23 

<210> 97 
<211> 31 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Oligonucleotide 
<400> 97 

cggactacct caaaataaag ctttatatac g 31 

<210> 98 
<211> 31 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Oligonucleotide 
<400> 98 

gtcatgatac cttgattaaa aaacaaacag c 31 



<210> 99 
<211> 26 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Oligonucleotide 
<400> 99 

ggctataatg cgcacataac ctcttg 26 

<210> 100 
<211> 33 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Oligonucleotide 
<400> 100 

aatcttttct tattttttgg ctaacgaata gcc 33 

<210> 101 
<211> 31 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Oligonucleotide 

<400> 101 

gtccaacttt ttggggtcag tacaaacttt g 31 

<210> 102 
<211> 28 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Oligonucleotide 
<400> 102 

taataacgcc gttattaaat agcctgcc 28 

<210> 103 
<211> 26 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Oligonucleotide 
<400> 103 

taagcaacgt ctgcttactg cccctc 26 

<210> 104 
<211> 33 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Oligonucleotide 
<400> 104 

gtgatggctt ctgataaaga taaatttata gcc 33 

<210> 105 
<211> 18 
<212> DNA 

<213> Artificial Sequence 



<220> 

<223> Oligonucleotide 
<400> 105 

taacaggcta agaggggc 18 

<210> 106 
<211> 28 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Oligonucleotide 
<400> 106 

attgccactc ttcttgatca aataaccg 28 

<210> 107 
<211> 29 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Oligonucleotide 
<400> 107 

aatgcgtctg ttgataattc aaattagtc 29 

<210> 108 
<211> 28 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Oligonucleotide 
<400> 108 

tagccgtttt attcagtata gatttgcg 28 



<210> 109 
<211> 23 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Oligonucleotide 
<400> 109 

gttcgtcggt aacccgtttc agc 23 

<210> 110 
<211> 20 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Oligonucleotide 
<400> 110 

atggcttaaa gagaggtgcc 20 

<210> 111 
<211> 22 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Oligonucleotide 
<400> 111 

cgtactttaa agggagaatg ac 22 

<210> 112 
<211> 22 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Oligonucleotide 
<400> 112 

gtgcttcctc attatggtga cg 22 

<210> 113 
<211> 20 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Oligonucleotide 
<400> 113 

gaatggaggg agattacacg 20 

<210> 114 
<211> 21 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Oligonucleotide 
<400> 114 

ccttagtggg taaacgctta c 21 

<210> 115 
<211> 21 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Oligonucleotide 
<400> 115 

ctttcaggca gctaaggaaa g 21 

<210> 116 
<211> 29 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Oligonucleotide 
<400> 116 

caatatgtat tattgattga gtaaacggg 29 

<210> 117 
<211> 20 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Oligonucleotide 
<400> 117 

cctcttccag gaataatccc 20 

<210> 118 
<211> 20 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Oligonucleotide 
<400> 118 

cggaaagcgg ttcacagatc 20 

<210> 119 
<211> 23 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Oligonucleotide 
<400> 119 

ctcgtaagtt tcgcagctta tta 23 

<210> 120 
<211> 20 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Oligonucleotide 
<400> 120 

tgaaattcct gtccgacagg 20 

<210> 121 
<211> 21 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Oligonucleotide 
<400> 121 

gcactaccgc aatgttattg c 21 

<210> 122 
<211> 24 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Oligonucleotide 
<400> 122 

gcttacccaa taaatagtta cacg 24 

<210> 123 
<211> 23 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Oligonucleotide 
<400> 123 

taaaacctgt cacaaatcac aaa 23 

<210> 124 
<211> 21 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Oligonucleotide 
<400> 124 

gtggcctgct tcaaactttc g 21 

<210> 125 
<211> 23 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Oligonucleotide 
<400> 125 

gtaaagtcta gcctggcggt tcg 23 

<210> 126 
<211> 31 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Oligonucleotide 
<400> 126 

taattctggt acgcctggca gatattttgc c 31 

<210> 127 
<211> 24 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Oligonucleotide 
<400> 127 

atcaacctca aaagggaaat cggg 24 

<210> 128 
<211> 23 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Oligonucleotide 
<400> 128 

taacttgttg taagccggat cgg 23 

<210> 129 
<211> 23 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Oligonucleotide 
<400> 129 

tgaagcatct atcgccggtt gcg 23 

<210> 130 
<211> 28
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Oligonucleotide 
<400> 130 

gattagaaat ccttttgaaa gcgcattg 

<210> 131 
<211> 20 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Oligonucleotide 
<400> 131 

cttattgggc accgcaatgg 

<210> 132 
<211> 27 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Oligonucleotide 

<400> 132 

cgaacacaat aaagatttaa ttcagcc 

<210> 133 
<211> 20 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Oligonucleotide 
<400> 133 

ctgatgctac tgtgtcaacg 

<210> 134 
<211> 22 
<212> DNA 



<213> Artificial Sequence
<220> 

<223> Oligonucleotide 
<400> 134 

aataatcaga catagcttag gc 

<210> 135 
<211> 21 
<212> DNA 

<213> Artificial Sequence 

<220> 

<223> Oligonucleotide 
<400> 135 

gccgtgatgg ttttcgcgtt c 

<210> 136 
<211> 21 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Oligonucleotide 
<400> 136 

tattttcctc ccgcgctaaa g 

<210> 137 
<211> 23 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Oligonucleotide 
<400> 137 

ttcagctgat gaccaccacg ctt 

<210> 138 
<211> 23 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Oligonucleotide 
<400> 138 

gagttgtcag agcaggatga ttc 

<210> 139 
<211> 22 
<212> DNA 



<213> Artificial Sequence
<220> 

<223> Oligonucleotide 
<400> 139 

tatctgcgct tatcctttat gg 

<210> 140 
<211> 23 
<212> DNA

<213> Artificial Sequence 
<220> 

<223> Oligonucleotide 
<400> 140 

cctttacggt gataaccgtc gcg 

<210> 141 
<211> 25 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Oligonucleotide 
<400> 141 

ctgacaagcc tctcattctc ttgtc 

<210> 142 
<211> 23 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Oligonucleotide 
<400> 142 

gagaattatc gaggtccggt atc

<210> 143 
<211> 22 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Oligonucleotide
<400> 143 

ctacgcgtta gcgatagact gc 

<210> 144 
<211> 31 
<212> DNA 



<213> Artificial Sequence
<220>

<223> Oligonucleotide
<400> 144

aggcttacta agaacaccag ggggagggga a 31



<210> 145 
<211> 30 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Oligonucleotide 
<400> 145

agtcataagc ttccccgctt actaagacta 30 

<210> 146
<211> 22
<212> DNA

<213> Artificial Sequence
<220>

<223> Oligonucleotide
<400> 146

cctcaaatcg gccataataa cc 22

<210> 147
<211> 22
<212> DNA

<213> Artificial Sequence
<220>

<223> Oligonucleotide
<400> 147

taaacaccgt cgtcagaaat gc 22

<210> 148
<211> 25
<212> DNA

<213> Artificial Sequence
<220>

<223> Oligonucleotide
<400> 148

tagactttta tccactttat tgctg 25



<210> 149
<211> 25 
<212> DNA 

<213> Artificial Sequence
<220> 

<223> Oligonucleotide 
<400> 149 

gtgtgccttt cggcgatatg gcgtg 

<210> 150 
<211> 25 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Oligonucleotide 
<400> 150 

cctttacgtg ggcggtgatt ttgtc 

<210> 151

<211> 24 

<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Oligonucleotide 

<400> 151 

tagctttgct cctggatgtt tgcc 

<210> 152 
<211> 30 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Oligonucleotide 
<400> 152 

gctgtaattt attcagcgtt tgtacatacg 

<210> 153 
<211> 26 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Oligonucleotide 

<400> 153 

tcagtcaact cgctgcggcg tgttac 

<210> 154 
<211> 28
<212> DNA

<213> Artificial Sequence
<220> 

<223> Oligonucleotide 
<400> 154 

cttattgttg cttagttagg gtagtcac 

<210> 155 
<211> 26 
<212> DNA 

<213> Artificial Sequence
<220> 

<223> Oligonucleotide 
<400> 155 

cagtcagtct caggggagga gcaatc 

<210> 156 
<211> 30 
<212> DNA 

<213> Artificial Sequence
<220> 

<223> Oligonucleotide 
<400> 156 

tgaatgcaca ataaaaaaat cccgaccctg 

<210> 157 
<211> 25 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Oligonucleotide 
<400> 157 

agtcgcgcag tactcctctt accag 

<210> 158 
<211> 24 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Oligonucleotide 
<400> 158 

taatttctca tcaggcggct ctgc 

<210> 159 
<211> 24 
<212> DNA 

<213> Artificial Sequence
<220> 

<223> Oligonucleotide 
<400> 159 

taacattatc agcctgctga cggc 24 

<210> 160 
<211> 28 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Oligonucleotide 
<400> 160 

ggccgaattc gtagggtaca gaggtaag 28

<210> 161 
<211> 30 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Oligonucleotide 
<400> 161 

ggccggatcc gtcattactg actggggcgg 30

(57) Abstract: The invention relates to small RNAs and ORFs of E. coli as mediators of cell and intercell regulation.

This International Searching Authority found multiple (groups of)
inventions in this international application, as follows: 

1. Claims: 1, 32 and partially 57

Polynucleotide comprising Candidate #8; antibody binding to 
it and use of the polynucleotide as mediator of cell or 
intercell regulation 

2. Claims: 2, 33 and partially 57 

Polynucleotide comprising Candidate #12; antibody binding to 
it and use of the polynucleotide as mediator of cell or 
intercell regulation 



3. Claims: 3, 34 and partially 57

Polynucleotide comprising Candidate #14; antibody binding to
it and use of the polynucleotide as mediator of cell or 
intercell regulation 



4. Claims: 4, 35 and partially 57 

Polynucleotide comprising Candidate #22; antibody binding to 
it and use of the polynucleotide as mediator of cell or 
intercell regulation 



5. Claims: 5, 36 and partially 57

Polynucleotide comprising Candidate #24; antibody binding to 
it and use of the polynucleotide as mediator of cell or 
intercell regulation 



6. Claims: 6, 37 and partially 57 

Polynucleotide comprising Candidate #25; antibody binding to 
it and use of the polynucleotide as mediator of cell or 
intercell regulation 



7. Claims: 7, 38 and partially 57

Polynucleotide comprising Candidate #26; antibody binding to 
it and use of the polynucleotide as mediator of cell or 
intercell regulation 



8. Claims: 8, 39 and partially 57 

Polynucleotide comprising Candidate #27; antibody binding to 
it and use of the polynucleotide as mediator of cell or
intercell regulation 



9. Claims: 9, 40 and partially 57 

Polynucleotide comprising Candidate #31; antibody binding to 
it and use of the polynucleotide as mediator of cell or 
intercell regulation 



10. Claims: 10, 41 and partially 57

Polynucleotide comprising Candidate #38; antibody binding to 
it and use of the polynucleotide as mediator of cell or 
intercell regulation 



11. Claims: 11, 42 and partially 57 

Polynucleotide comprising Candidate #40; antibody binding to 
it and use of the polynucleotide as mediator of cell or 
intercell regulation 



12. Claims: 12, 43 and partially 57

Polynucleotide comprising Candidate #41-1; antibody binding 
to it and use of the polynucleotide as mediator of cell or 
intercell regulation 



13. Claims: 13, 44 and partially 57

Polynucleotide comprising Candidate #41-11; antibody binding
to it and use of the polynucleotide as mediator of cell or
intercell regulation 



14. Claims: 14, 45 and partially 57

Polynucleotide comprising Candidate #43; antibody binding to 
it and use of the polynucleotide as mediator of cell or 
intercell regulation 



15. Claims: 15, 46 and partially 57

Polynucleotide comprising Candidate #52-1; antibody binding
to it and use of the polynucleotide as mediator of cell or 
intercell regulation 



16. Claims: 16, 47 and partially 57

Polynucleotide comprising Candidate #52-11; antibody binding
to it and use of the polynucleotide as mediator of cell or 
intercell regulation 



17. Claims: 17, 48 and partially 57 

Polynucleotide comprising Candidate #55-1; antibody binding 
to it and use of the polynucleotide as mediator of cell or 
intercell regulation 



18. Claims: 18, 49 and partially 57

Polynucleotide comprising Candidate #55-11; antibody binding 
to it and use of the polynucleotide as mediator of cell or 
intercell regulation 



19. Claims: 19, 50 and partially 57 

Polynucleotide comprising Candidate #61; antibody binding to 
it and use of the polynucleotide as mediator of cell or
intercell regulation

Polynucleotide comprising Candidate #36; polypeptide encoded
by candidate #36; antibody binding to it and use of the 
polynucleotide as mediator of cell or intercell regulation 



24. Claims: 24, 30, 55 and partially 57

Polynucleotide comprising Candidate #49; polypeptide encoded
by candidate #49; antibody binding to it and use of the 
polynucleotide as mediator of cell or intercell regulation

Polynucleotide comprising Candidate #50; polypeptide encoded
by candidate #50; antibody binding to it and use of the 
polynucleotide as mediator of cell or intercell regulation 



